Similar articles
 20 similar articles found (search time: 15 ms)
1.
Linkage disequilibrium (LD) mapping has been applied to many simple, monogenic, overtly Mendelian human traits, with great success. However, extensions and applications of LD mapping approaches to more complex human quantitative traits have not been straightforward. In this article, we consider the analysis of biallelic DNA marker loci and human quantitative trait loci in settings that involve sampling individuals from opposite ends of the trait distribution. The purpose of this sampling strategy is to enrich samples for individuals likely to possess (and not possess) trait-influencing alleles. Simple statistical models for detecting LD between a trait-influencing allele and neighboring marker alleles are derived that make use of this sampling scheme. The power of the proposed method is investigated analytically for some hypothetical gene-effect scenarios. Our studies indicate that LD mapping of loci influencing human quantitative trait variation should be possible in certain settings. Finally, we consider possible extensions of the proposed methods, as well as areas for further consideration and improvement.

2.
Moskvina V, Schmidt KM. Biometrics, 2006, 62(4): 1116-1123
With the availability of fast genotyping methods and genomic databases, the search for statistical association of single nucleotide polymorphisms with a complex trait has become an important methodology in medical genetics. However, even fairly rare errors occurring during the genotyping process can lead to spurious association results and decrease in statistical power. We develop a systematic approach to study how genotyping errors change the genotype distribution in a sample. The general M-marker case is reduced to that of a single-marker locus by recognizing the underlying tensor-product structure of the error matrix. Both method and general conclusions apply to the general error model; we give detailed results for allele-based errors of size depending both on the marker locus and the allele present. Multiple errors are treated in terms of the associated diffusion process on the space of genotype distributions. We find that certain genotype and haplotype distributions remain unchanged under genotyping errors, and that genotyping errors generally render the distribution more similar to the stable one. In case-control association studies, this will lead to loss of statistical power for nondifferential genotyping errors and increase in type I error for differential genotyping errors. Moreover, we show that allele-based genotyping errors do not disturb Hardy-Weinberg equilibrium in the genotype distribution. In this setting we also identify maximally affected distributions. As they correspond to situations with rare alleles and marker loci in high linkage disequilibrium, careful checking for genotyping errors is advisable when significant association based on such alleles/haplotypes is observed in association studies.
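The claim that allele-based errors preserve Hardy-Weinberg equilibrium can be checked numerically for a single biallelic marker. The sketch below is only an illustration, not the authors' derivation: it assumes each of the two alleles in a genotype is misread independently, with hypothetical rates `e_A` (allele A read as a) and `e_a` (allele a read as A), and all function names are invented for this example.

```python
from itertools import product


def hwe(p: float) -> dict:
    """Hardy-Weinberg genotype distribution, keyed by the number of A alleles."""
    q = 1.0 - p
    return {2: p * p, 1: 2.0 * p * q, 0: q * q}


def observed_distribution(p: float, e_A: float, e_a: float) -> dict:
    """Observed genotype distribution when the true genotypes are in HWE with
    A-allele frequency p and each allele is misread independently.

    e_A: probability that a true A is read as a (assumed error model)
    e_a: probability that a true a is read as A (assumed error model)
    """
    read_as_A = {"A": 1.0 - e_A, "a": e_a}  # P(read as A | true allele)
    freq = {"A": p, "a": 1.0 - p}
    out = {0: 0.0, 1: 0.0, 2: 0.0}
    # Enumerate ordered true allele pairs and ordered observed allele pairs.
    for t1, t2 in product("Aa", repeat=2):
        for o1, o2 in product("Aa", repeat=2):
            pr = freq[t1] * freq[t2]
            pr *= read_as_A[t1] if o1 == "A" else 1.0 - read_as_A[t1]
            pr *= read_as_A[t2] if o2 == "A" else 1.0 - read_as_A[t2]
            out[(o1 == "A") + (o2 == "A")] += pr
    return out


def shifted_frequency(p: float, e_A: float, e_a: float) -> float:
    """A-allele frequency after the errors are applied."""
    return p * (1.0 - e_A) + (1.0 - p) * e_a
```

Under these assumptions the observed distribution equals `hwe(shifted_frequency(p, e_A, e_a))`: the allele frequency moves, but the genotype proportions stay in Hardy-Weinberg form.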

3.
We present a Bayesian, Markov-chain Monte Carlo method for fine-scale linkage-disequilibrium gene mapping using high-density marker maps. The method explicitly models the genealogy underlying a sample of case chromosomes in the vicinity of a putative disease locus, in contrast with the assumption of a star-shaped tree made by many existing multipoint methods. Within this modeling framework, we can allow for missing marker information and for uncertainty about the true underlying genealogy and the makeup of ancestral marker haplotypes. A crucial advantage of our method is the incorporation of the shattered coalescent model for genealogies, allowing for multiple founding mutations at the disease locus and for sporadic cases of disease. Output from the method includes approximate posterior distributions of the location of the disease locus and population-marker haplotype proportions. In addition, output from the algorithm is used to construct a cladogram to represent genetic heterogeneity at the disease locus, highlighting clusters of case chromosomes sharing the same mutation. We present detailed simulations to provide evidence of improvements over existing methodology. Furthermore, inferences about the location of the disease locus are shown to remain robust to modeling assumptions.

4.
Transmission-disequilibrium tests for quantitative traits.   Cited by: 9 (self-citations: 3; citations by others: 6)
The transmission-disequilibrium test (TDT) of Spielman et al. is a family-based linkage-disequilibrium test that offers a powerful way to test for linkage between alleles and phenotypes that is either causal (i.e., the marker locus is the disease/trait allele) or due to linkage disequilibrium. The TDT is equivalent to a randomized experiment and, therefore, is resistant to confounding. When the marker is extremely close to the disease locus or is the disease locus itself, tests such as the TDT can be far more powerful than conventional linkage tests. To date, the TDT and most other family-based association tests have been applied only to dichotomous traits. This paper develops five TDT-type tests for use with quantitative traits. These tests accommodate either unselected sampling or sampling based on selection of phenotypically extreme offspring. Power calculations are provided and show that, when a candidate gene is available (1) these TDT-type tests are at least an order of magnitude more efficient than two common sib-pair tests of linkage; (2) extreme sampling results in substantial increases in power; and (3) if the most extreme 20% of the phenotypic distribution is selectively sampled, across a wide variety of plausible genetic models, quantitative-trait loci explaining as little as 5% of the phenotypic variation can be detected at the .0001 alpha level with <300 observations.

5.
Linkage analysis was developed to detect excess co-segregation of the putative alleles underlying a phenotype with the alleles at a marker locus in family data. Many different variations of this analysis and corresponding study design have been developed to detect this co-segregation. Linkage studies have been shown to have high power to detect loci that have alleles (or variants) with a large effect size, i.e. alleles that make large contributions to the risk of a disease or to the variation of a quantitative trait. However, alleles with a large effect size tend to be rare in the population. In contrast, association studies are designed to have high power to detect common alleles which tend to have a small effect size for most diseases or traits. Although genome-wide association studies have been successful in detecting many new loci with common alleles of small effect for many complex traits, these common variants often do not explain a large proportion of disease risk or variation of the trait. In the past, linkage studies were successful in detecting regions of the genome that were likely to harbor rare variants with large effect for many simple Mendelian diseases and for many complex traits. However, identifying the actual sequence variant(s) responsible for these linkage signals was challenging because of difficulties in sequencing the large regions implicated by each linkage peak. Current 'next-generation' DNA sequencing techniques have made it economically feasible to sequence all exons or the whole genomes of a reasonably large number of individuals. Studies have shown that rare variants are quite common in the general population, and it is now possible to combine these new DNA sequencing methods with linkage studies to identify rare causal variants with a large effect size. A brief review of linkage methods is presented here with examples of their relevance and usefulness for the interpretation of whole-exome and whole-genome sequence data.

6.
Where recent admixture has occurred between two populations that have different disease rates for genetic reasons, family-based association studies can be used to map the genes underlying these differences, if the ancestry of the alleles at each locus examined can be assigned to one of the two founding populations. This article explores the statistical power and design requirements of this approach. Markers suitable for assigning the ancestry of genomic regions could be defined by grouping alleles at closely spaced microsatellite loci into haplotypes, or generated by representational difference analysis. For a given relative risk between populations, the sample size required to detect a disease locus that accounts for this relative risk by linkage-disequilibrium mapping in an admixed population is not critically dependent on assumptions about genotype penetrances or allele frequencies. Using the transmission-disequilibrium test to search the genome for a locus that accounts for a relative risk of between 2 and 3 in a high-risk population, compared with a low-risk population, generally requires between 150 and 800 case-parent pairs of mixed descent. The optimal strategy is to conduct an initial study using markers spaced at ≤10 cM with cases from the second and third generations of mixed descent, and then to map the disease loci more accurately in a subsequent study of a population with a longer history of admixture. This approach has greater statistical power than allele-sharing designs and has obvious applications to the genetics of hypertension, non-insulin-dependent diabetes, and obesity.

7.
Variance component modeling for linkage analysis of quantitative traits is a powerful tool for detecting and locating genes affecting a trait of interest, but the presence of genetic heterogeneity will decrease the power of a linkage study and may even give biased estimates of the location of the quantitative trait loci. Many complex diseases are believed to be influenced by multiple genes and therefore genetic heterogeneity is likely to be present for many real applications of linkage analysis. We consider a mixture of multivariate normals to model locus heterogeneity by allowing only a proportion of the sampled pedigrees to segregate trait-influencing allele(s) at a specific locus. However, for mixtures of normals the classical asymptotic distribution theory of the maximum likelihood estimates does not hold, so tests of linkage and/or heterogeneity are evaluated using resampling methods. It is shown that allowing for genetic heterogeneity leads to an increase in power to detect linkage. This increase is more prominent when the genetic effect of the locus is small or when the percentage of pedigrees not segregating trait-influencing allele(s) at the locus is high.

8.
We report both a recombination event that places the Huntington disease gene proximal to the marker D4S98 and an extended linkage-disequilibrium study that uses this marker and confirms the existence of disequilibrium between it and the HD locus. We also report the cloning of other sequences in the region around D4S98, including a new polymorphic marker R10 and conserved sequences that identify a gene in the region of interest.

9.
A composite-conditional-likelihood (CCL) approach is proposed to map the position of a trait-influencing mutation (TIM) using the ancestral recombination graph (ARG) and importance sampling to reconstruct the genealogy of DNA sequences with respect to windows of marker loci and predict the linkage disequilibrium pattern observed in a sample of cases and controls. The method is designed to fine-map the location of a disease mutation, not as an association study. The CCL function proposed for the position of the TIM is a weighted product of conditional likelihood functions for windows of a given number of marker loci that encompass the TIM locus, given the sample configuration at the marker loci in those windows. A rare recessive allele is assumed for the TIM and single nucleotide polymorphisms (SNPs) are considered as markers. The method is applied to a range of simulated data sets. Not only do the CCL profiles converge more rapidly with smaller window sizes as the number of simulated histories of the sampled sequences increases, but the maximum-likelihood estimates for the position of the TIM remain as satisfactory, while requiring significantly less computing time. The simulations also suggest that non-random samples, more precisely, a non-proportional number of controls versus the number of cases, has little effect on the estimation procedure as well as sample size and marker density beyond some threshold values. Moreover, when compared with some other recent methods under the same assumptions, the CCL approach proves to be competitive.

10.
Case-control studies compare marker-allele distributions in affected and unaffected individuals, and significant results suggest linkage but may simply reflect population structure. For markers with $m$ alleles ($m \geq 2$), a McNemar-like statistic, $I$, estimates the level of population association between marker and disease loci. To test for linkage after significant case-control tests, within-family tests are performed. These operate on the contingency table, with $(i, j)$th element equal to the number of parents that transmit marker allele $M_i$ and do not transmit marker allele $M_j$ to an affected offspring. The dimension of the table is the number of alleles at the marker locus. Three test statistics have recently been proposed in the literature: $T_c$ compares symmetric pairs of cells $(i, j)$ and $(j, i)$, $T_m$ compares row and column totals for the same marker allele, and a likelihood ratio statistic $T_l$ uses all the cells in the table. In addition, we consider a new statistic, $T_{mhet}$, that uses only the heterozygous parents and is approximately $\chi^2$ with $(m - 1)$ df. We use a Monte Carlo test to guarantee valid tests and to demonstrate the inferiority of $T_c$ and the equality of $T_m$ and $T_l$ in terms of power. The power of the $T_{mhet}$ test is close but not always equal to the power of the $T_m$ test. We also show that under the alternative hypothesis of linkage, $T_m$ is approximately noncentral $\chi^2$ with $(m - 1)$ df and noncentrality parameter $2N_T(1 - 2\theta)^2 I^*$, when data on single affecteds in $N_T$ families are used. If the disease has a low population frequency, then $I^*$ is estimated using the case-control statistic $I$. This offers a basis for choosing sample size, or choosing a marker system.
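A statistic that compares symmetric cells of the transmission table, as the abstract describes for the symmetric-pair test, can be sketched in a few lines. The form used here is Bowker's test of symmetry; whether it matches the paper's exact definition is an assumption, and the function name and example counts are hypothetical.

```python
def symmetry_statistic(table):
    """Bowker-type symmetry statistic for a square transmission table.

    table[i][j] = number of parents transmitting allele i and not allele j.
    Returns (statistic, degrees_of_freedom). Pairs with no observations are
    skipped and do not contribute a degree of freedom.
    """
    m = len(table)
    if any(len(row) != m for row in table):
        raise ValueError("table must be square")
    stat = 0.0
    df = 0
    for i in range(m):
        for j in range(i + 1, m):
            n_ij, n_ji = table[i][j], table[j][i]
            total = n_ij + n_ji
            if total == 0:
                continue
            stat += (n_ij - n_ji) ** 2 / total
            df += 1
    return stat, df


# Hypothetical counts for a three-allele marker (diagonal cells are unused).
example = [
    [0, 10, 5],
    [4, 0, 8],
    [5, 2, 0],
]
stat, df = symmetry_statistic(example)
```

With two alleles this reduces to the familiar McNemar form (b - c)^2 / (b + c). The number of compared pairs grows as m(m - 1)/2, so the degrees of freedom rise quickly as alleles are added.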

11.
Historically, most methods for detecting linkage disequilibrium were designed for use with diallelic marker loci, for which the analysis is straightforward. With the advent of polymorphic markers with many alleles, the normal approach to their analysis has been either to extend the methodology for two-allele systems (leading to an increase in df and to a corresponding loss of power) or to select the allele believed to be associated and then collapse the other alleles, reducing, in a biased way, the locus to a diallelic system. I propose a likelihood-based approach to testing for linkage disequilibrium, an approach that becomes more conservative as the number of alleles increases, and as the number of markers considered jointly increases in a multipoint test for linkage disequilibrium, while maintaining high power. Properties of this method for detecting associations and fine mapping the location of disease traits are investigated. It is found to be, in general, more powerful than conventional methods, and it provides a tractable framework for the fine mapping of new disease loci. Application to the cystic fibrosis data of Kerem et al. is included to illustrate the method.

12.
Linkage disequilibrium has been used to help in the identification of genes predisposing to certain qualitative diseases. Although several linkage-disequilibrium tests have been developed for localization of genes influencing quantitative traits, these tests have not been thoroughly compared with one another. In this report we compare, under a variety of conditions, several different linkage-disequilibrium tests for identification of loci affecting quantitative traits. These tests use either single individuals or parent-child trios. When we compared tests with equal samples, we found that the truncated measured allele (TMA) test was the most powerful. The trait allele frequencies, the stringency of sample ascertainment, the number of marker alleles, and the linked genetic variance affected the power, but the presence of polygenes did not. When there were more than two trait alleles at a locus in the population, power to detect disequilibrium was greatly diminished. The presence of unlinked disequilibrium (D'*) increased the false-positive error rates of disequilibrium tests involving single individuals but did not affect the error rates of tests using family trios. The increase in error rates was affected by the stringency of selection, the trait allele frequency, and the linked genetic variance but not by polygenic factors. In an equilibrium population, the TMA test is most powerful, but, when adjusted for the presence of admixture, Allison test 3 becomes the most powerful whenever D'*>.15.

13.
Recent admixture between genetically differentiated populations can result in high levels of association between alleles at loci that are ≤10 cM apart. The transmission/disequilibrium test (TDT) proposed by Spielman et al. (1993) can be a powerful test of linkage between disease and marker loci in the presence of association and therefore could be a useful test of linkage in admixed populations. The degree of association between alleles at two loci depends on the differences in allele frequencies, at the two loci, in the founding populations; therefore, the choice of marker is important. For a multiallelic marker, one strategy that may improve the power of the TDT is to group marker alleles within a locus, on the basis of information about the founding populations and the admixed population, thereby collapsing the marker into one with fewer alleles. We have examined the consequences of collapsing a microsatellite into a two-allele marker, when two founding populations are assumed for the admixed population, and have found that if there is random mating in the admixed population, then typically there is a collapsing for which the power of the TDT is greater than that for the original microsatellite marker. A method is presented for finding the optimal collapsing that has minimal dependence on the disease and that uses estimates either of marker allele frequencies in the two founding populations or of marker allele frequencies in the current, admixed population and in one of the founding populations. Furthermore, this optimal collapsing is not always the collapsing with the largest difference in allele frequencies in the founding populations. To demonstrate this strategy, we considered a recent data set, published previously, that provides frequency estimates for 30 microsatellites in 13 populations.

14.
Characterization of eight VNTR loci by agarose gel electrophoresis   Cited by: 11 (self-citations: 0; citations by others: 11)
Allelic frequencies and their confidence intervals were obtained for eight independent VNTR loci from a sample of more than 75 Utah Caucasians. Using high-resolution agarose gel electrophoresis, we were able to resolve alleles at the D17S5 locus that differed by only one repeating unit; it was therefore possible to name the alleles according to the number of repeating units each contained. Two a priori probabilities were calculated for each VNTR locus separately and for all eight loci jointly: (i) the "power of exclusion" for an alleged father/mother/child trio and for an alleged parent/child duo, and (ii) the "probability of matching" when two unrelated individuals or two siblings are genotyped.

15.
The transmission/disequilibrium test (TDT) and the affected sib pair test (ASP) both test for the association of a marker allele with some condition. Here, we present methods for calculating the probability of detecting the association (power) for a study examining a fixed number of families for suitability for the study and for calculating the number of such families to be examined. Both calculations use a genetic model for the association. The model considered posits a bi-allelic marker locus that is linked to a bi-allelic disease locus with a possibly nonzero recombination fraction between the loci. The penetrance of the disease is an increasing function of the number of disease alleles. The TDT tests whether the transmission by a heterozygous parent of a particular allele at a marker locus to an affected offspring occurs with probability greater than 0.5. The ASP tests whether transmission of the same allele to two affected sibs occurs with probability greater than 0.5. In either case, evidence that the probability is greater than 0.5 is evidence for association between the marker and the disease. Study inclusion criteria (IC) can greatly affect the necessary sample size of a TDT or ASP study. The IC we consider include requiring a randomly selected parent, at least one parent, or both parents to be heterozygous. They also allow a specified minimum number of affected offspring to be required (TDT only). We use elementary probability calculations rather than complex mathematical manipulations or asymptotic methods (large sample size approximations) to compute power and requisite sample size for a proposed study. The advantages of these methods are simplicity and generality.
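The idea of testing whether a transmission probability exceeds 0.5, and of computing power from elementary probability rather than asymptotics, can be sketched with an exact binomial calculation. This is a simplified illustration and not the authors' procedure: it assumes `n` independent transmissions from heterozygous parents, each carrying the allele of interest with a hypothetical probability `tau`, and it ignores the genetic model and inclusion criteria discussed in the abstract. All function names are invented for this example.

```python
from math import comb


def tdt_statistic(b: int, c: int) -> float:
    """McNemar-type TDT statistic: b transmissions vs c non-transmissions
    of the allele of interest from heterozygous parents."""
    if b + c == 0:
        raise ValueError("no informative transmissions")
    return (b - c) ** 2 / (b + c)


def exact_one_sided_p(b: int, c: int) -> float:
    """P(X >= b) for X ~ Binomial(b + c, 0.5)."""
    n = b + c
    return sum(comb(n, k) for k in range(b, n + 1)) / 2 ** n


def critical_count(n: int, alpha: float) -> int:
    """Smallest k with P(X >= k | p = 0.5) <= alpha; n + 1 if none exists."""
    tail = 0.0
    for k in range(n, -1, -1):
        new_tail = tail + comb(n, k) / 2 ** n
        if new_tail > alpha:
            return k + 1
        tail = new_tail
    return 0


def exact_power(n: int, tau: float, alpha: float) -> float:
    """Power of the exact one-sided test when the true transmission
    probability is tau (assumed constant across parents)."""
    k0 = critical_count(n, alpha)
    return sum(
        comb(n, k) * tau ** k * (1.0 - tau) ** (n - k)
        for k in range(k0, n + 1)
    )
```

For example, `exact_power(100, 0.65, 0.05)` gives the power with 100 informative transmissions under these assumptions, and the smallest `n` reaching a target power can be found by a simple loop over `n`.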

16.
One way to perform linkage-disequilibrium (LD) mapping of genetic traits is to use single markers. Since dense marker maps (such as single-nucleotide polymorphism and high-resolution microsatellite maps) are available, it is natural and practical to generalize single-marker LD mapping to high-resolution haplotype or multiple-marker LD mapping. This article investigates high-resolution LD-mapping methods, for complex diseases, based on haplotype maps or microsatellite marker maps. The objective is to explore test statistics that combine information from haplotype blocks or multiple markers. Based on two coding methods, genotype coding and haplotype coding, Hotelling's $T^2$ statistics $T_G$ and $T_H$ are proposed to test the association between a disease locus and two haplotype blocks or two markers. The validity of the two $T^2$ statistics is proved by theoretical calculations. A statistic $T_C$, an extension of the traditional $\chi^2$ method of comparing haplotype frequencies, is introduced by simply adding the $\chi^2$ test statistics of the two haplotype blocks together. The merit of the three methods is explored by calculation and comparison of power and of type I errors. In the presence of LD between the two blocks, the type I error of $T_C$ is higher than that of $T_H$ and $T_G$, since $T_C$ ignores the correlation between the two blocks. For each of the three statistics, the power of using two haplotype blocks is higher than that of using only one haplotype block. By power comparison, we notice that $T_C$ has higher power than that of $T_H$, and $T_H$ has higher power than that of $T_G$. In the absence of LD between the two blocks, the power of $T_C$ is similar to that of $T_H$ and higher than that of $T_G$. Hence, we advocate use of $T_H$ in the data analysis. In the presence of LD between the two blocks, $T_H$ takes into account the correlation between the two haplotype blocks and has a lower type I error and higher power than $T_G$. Besides, the feasibility of the methods is shown by sample-size calculation.

17.
Yoon SL, Kim DC, Cho SH, Lee SY, Chu IS, Heo J, Leem SH. BMB reports, 2010, 43(10): 698-703
In this study, we characterized two blocks of minisatellites in the 5' upstream region of the BORIS gene (BORIS-MS1, -MS2). BORIS-MS2 was found to be polymorphic; therefore, this locus could be useful as a marker for DNA fingerprinting. We assessed the association between BORIS-MS2 and breast cancer by a case-control study with 428 controls and 793 breast cancer cases. Rare alleles in the younger group (age, <40) were associated with a statistically significant increased risk of breast cancer (odds ratio, 4.84; 95% confidence interval, 1.06-22.22; and P = 0.026). A statistically significant association between the short rare alleles and cancer was identified in the younger group (8.02; 1.01-63.83; P = 0.021). Kaplan-Meier estimates showed that poor prognosis was associated with patients who carried the rare alleles. Our data suggest that the short rare alleles of BORIS-MS2 could be used to identify the risk for breast cancer in young patients.

18.
The association of some diseases with specific alleles of certain genetic markers has been difficult to explain. Several explanations have been proposed for the phenomenon of association, e.g. the existence of multiple, interacting genes (epistasis) or a disease locus in linkage disequilibrium with the marker locus. One might suppose that when marker data from families with associated diseases are analyzed for linkage, the existence of the association would assure that linkage will be found, and found at a tight recombination fraction. In fact, however, linkage analyses of some diseases associated with HLA, as well as diseases associated with alleles at other loci located throughout the genome, show significant evidence against linkage, and others show loose linkage, to the puzzlement of many researchers. In part, the puzzlement arises because linkage analysis is ideal for looking for loci that are necessary, even if not sufficient, for disease expression but may be much less useful for finding loci that are neither necessary nor sufficient for disease expression (so-called susceptibility loci). This work explores what happens when one looks for linkage to susceptibility loci. A susceptibility locus in this case means that the allele increases risk but is neither necessary nor sufficient for disease expression. It might be either an allele at the marker locus itself that is increasing susceptibility or an allele at a locus in linkage disequilibrium with the marker. This work uses computer simulation to examine how linkage analyses behave when confronted with data from such a model.(ABSTRACT TRUNCATED AT 250 WORDS)

19.
The affected-pedigree-member (APM) method of linkage analysis is a nonparametric statistic that tests for nonrandom cosegregation of a disease and marker loci. The APM statistic is based on the observation that if a marker locus is near a disease-susceptibility locus, then affected individuals within a family should be more similar at the marker locus than is expected by chance. The APM statistic measures marker similarity in terms of identity by state (IBS) of marker alleles; that is, two alleles are IBS if they are the same, regardless of their ancestral origin. Since the APM statistic measures increased marker similarity, it makes no assumptions concerning how the disease is inherited; this can be an advantage when dealing with complex diseases for which the mode of inheritance is difficult to determine. We investigate here the power of the APM statistic to detect linkage in the context of a genomewide search. In such a search, the APM statistic is evaluated at a grid of markers. Then regions with high APM statistics are investigated more thoroughly by typing more markers in the region. Using simulated data, we investigate various search strategies and recommend an optimal search strategy that maximizes the power to detect linkage while minimizing the false-positive rate and number of markers. We determine an optimal series of three increasing cut-points and an independent criterion for significance.

20.
Fan R, Jung J. Human heredity, 2002, 54(3): 132-150
In this paper, we extend association study methods of both Fan et al. [Hum Hered 2002;53:130-145], in which a quantitative trait locus (QTL) and a multi-allele marker are considered for trio families, and Fan and Xiong [Biostatistics 2003, in press], in which a QTL and a bi-allelic marker are considered for nuclear families. The objective is to build mixed models for association study between a QTL and a multi-allelic marker for nuclear families with any number of offspring. Two types of nuclear family data are considered: the first is genetic data of offspring from at least one heterozygous parent, and the second is genetic data of offspring of nuclear family. (1) For the data of offspring from at least one heterozygous parent, we assume that at least one parent is heterozygous at the marker locus, and we may infer clearly the transmission of parental marker alleles to the offspring. We show that it can be used in association study in the presence of linkage. The theoretical basis is the difference between the conditional mean of trait value given an allele is transmitted and the conditional mean of trait value given the allele is not transmitted from a heterozygous parent. To build valid models, we calculate the variance-covariance structure of trait values of offspring. Besides, the reduction of the number of parameters is discussed under an assumption of tight linkage between the trait locus and the marker. (2) For the data of offspring of nuclear family, we show that it can be used in general association study. In this case, the theoretical basis is the difference between the conditional mean of trait values given an allele is transmitted from a parent and the population mean. Then, we calculate the variance-covariance structure of trait values of offspring. (3) Based on the theoretical analysis, mixed models are built for each type of the data, and related test statistics are proposed for association study. By power calculation and comparison, we show that, in some instances, the proposed test statistics have higher power than that obtained by collapsing alleles into new ones. The proposed models are used to analyze chromosome 4 and chromosome 16 data of the Oxford asthma data, Genetic Analysis Workshop 12.
