20 similar articles found (search time: 15 ms)
1.
2.
3.
Bias toward the null hypothesis in model-free linkage analysis is highly dependent on the test statistic used
Cordell HJ 《American journal of human genetics》2004,74(6):1294-1302
Recently, it has been suggested that traditional nonparametric multipoint-linkage procedures can show a "bias" toward the null hypothesis of no effect when there is incomplete information about allele sharing at genotyped marker loci (or at positions in between marker loci). Here, I investigate the extent of this bias for a variety of test statistics commonly used in qualitative- ("affecteds only") and quantitative-trait linkage analysis. Through simulation and analytical derivation, I show that many of the test statistics available in standard linkage analysis packages (such as Genehunter, Merlin, and Allegro) are, in fact, not affected by this bias problem. A few test statistics--most notably the nonparametric linkage statistic and, to a lesser extent, the Aspex-MLS and Haseman-Elston statistics--are affected by the bias. Variance-components procedures, although unbiased, can show inflation or deflation of the test statistic attributable to the inclusion of pairs with incomplete identity-by-descent information. Results obtained--for instance, in genome scans--using these methods might therefore be worth revisiting to see if greater power can be obtained by use of an alternative statistic or by eliminating or downweighting uninformative relative pairs.
4.
Parametric and nonparametric multipoint linkage analysis with imprinting and two-locus-trait models: application to mite sensitization
Strauch K Fimmers R Kurz T Deichmann KA Wienker TF Baur MP 《American journal of human genetics》2000,66(6):1945-1957
We present two extensions to linkage analysis for genetically complex traits. The first extension allows investigators to perform parametric (LOD-score) analysis of traits caused by imprinted genes, that is, of traits showing a parent-of-origin effect. By specification of two heterozygote penetrance parameters, paternal and maternal origin of the mutation can be treated differently in terms of probability of expression of the trait. Therefore, a single-disease-locus-imprinting model includes four penetrances instead of only three. In the second extension, parametric and nonparametric linkage analysis with two trait loci is formulated for a multimarker setting, optionally taking imprinting into account. We have implemented both methods into the program GENEHUNTER. The new tools, GENEHUNTER-IMPRINTING and GENEHUNTER-TWOLOCUS, were applied to human family data for sensitization to mite allergens. The data set comprises pedigrees from England, Germany, Italy, and Portugal. With single-disease-locus-imprinting MOD-score analysis, we find several regions that show at least suggestive evidence for linkage. Most prominently, a maximum LOD score of 4.76 is obtained near D8S511, for the English population, when a model that implies complete maternal imprinting is used. Parametric two-trait-locus analysis yields a maximum LOD score of 6.09 for the German population, occurring exactly at D4S430 and D18S452. The heterogeneity model specified for analysis alludes to complete maternal imprinting at both disease loci. Altogether, our results suggest that the two novel formulations of linkage analysis provide valuable tools for genetic mapping of multifactorial traits.
5.
Most linkage programs assume linkage equilibrium among multiple linked markers. This assumption may lead to bias for tightly linked markers where strong linkage disequilibrium (LD) exists. We used simulated data from Genetic Analysis Workshop 14 to examine the possible effect of LD on multipoint linkage analysis. Single-nucleotide polymorphism packets from a non-disease-related region that was generated with LD were used for both model-free and parametric linkage analyses. Results showed that high LD among markers can induce false-positive evidence of linkage for affected sib-pair analysis when parental data are missing. Bias can be eliminated with parental data and can be reduced when additional markers not in LD are included in the analyses.
6.
Model misspecification and multipoint linkage analysis. Cited by: 9 (self-citations: 0, citations by others: 9)
Pairwise linkage analysis is robust to genetic model misspecification provided dominance is correctly specified, the primary effect being inflation of the recombination fraction. By contrast, we show that multipoint analysis under misspecified models is not robust when a putative disease locus is placed between close flanking markers, with potentially spuriously negative multipoint lod scores being produced. The problem is due to incorrect attribution of segregation of a disease allele and the consequent conclusion of (unlikely) double crossovers between flanking markers. As a possible solution, we propose the use of high disease allele frequencies, as this allows probabilistically for nonsegregation (through parental homozygosity or dual matings). We show analytically and through analysis of pedigree data simulated under a two-locus heterogeneity model that using a disease allele frequency of 0.05 in the dominant case and 0.25 in the recessive case is quite robust in producing positive multipoint lod scores with close flanking markers across a broad range of conditions including varying allele frequencies, epistasis, genetic heterogeneity and phenocopies.
7.
8.
Eeva-Marja Sankila Thomas Lehner Aldur W. Eriksson Henrik Forsius Jussi Kärnä David Page Jürg Ott Albert de la Chapelle 《Human genetics》1989,84(1):66-70
Summary: Multipoint linkage analysis of choroideremia (TCD) and seven X chromosomal restriction fragment length polymorphisms (RFLPs) was carried out in 18 Finnish TCD families. The data place TCD distal to PGK and DXS72, very close to DXYS1 and DXYS5 (Zmax = 24 at θ = 0) and proximal to DXYS4 and DXYS12. This agrees with the data obtained from other linkage studies and from physical mapping. All the TCD males and carrier females studied have the same DXYS1 allele in coupling with TCD. In Northeastern Finland, 66/69 chromosomes carrying TCD had the same haplotype at loci DXS72, DXYS1, DXYS4, and DXYS12. The same haplotype is seen in only 15/99 chromosomes not carrying TCD. Moreover, in 71/104 non-TCD chromosomes, the haplotype at six marker loci is different from those seen in any of the 76 TCD chromosomes. This supports the previously described hypothesis that the large Northern Finnish choroideremia pedigrees, comprising a total of over 80 living patients representing more than a fifth of all TCD patients described worldwide, carry the same mutation. These linkage and haplotype data provide improved opportunities for prenatal diagnosis based on RFLP studies.
9.
10.
In this article we deal with two-locus nonparametric linkage (NPL) analysis, mainly in the context of conditional analysis. This means that one incorporates single-locus analysis information through conditioning when performing a two-locus analysis. Here we describe different strategies for using this approach. Cox et al. [Nat Genet 1999;21:213-215] implemented this as follows: (i) Calculate the one-locus NPL process over the included genome region(s). (ii) Weight the individual pedigree NPL scores using a weighting function depending on the NPL scores for the corresponding pedigrees at specific conditioning loci. We generalize this by conditioning with respect to the inheritance vector rather than the NPL score and by separating between the case of known (predefined) and unknown (estimated) conditioning loci. In the latter case we choose conditioning locus, or loci, according to predefined criteria. The most general approach results in a random number of selected loci, depending on the results from the previous one-locus analysis. Major topics in this article include discussions on optimal score functions with respect to the noncentrality parameter (NCP), and how to calculate adequate p values and perform power calculations. We also discuss issues related to multiple tests which arise from the two-step procedure with several conditioning loci as well as from the genome-wide tests.
11.
O'Connell JR 《Human heredity》2001,51(4):226-240
The calculation of multipoint likelihoods of pedigree data is crucial for extracting the full available information needed for both parametric and nonparametric linkage analysis. Recent mathematical advances in both the Elston-Stewart and Lander-Green algorithms for computing exact multipoint likelihoods of pedigree data have enabled researchers to analyze data sets containing more markers and more individuals both faster and more efficiently. This paper presents novel algorithms that further extend the computational boundary of the Elston-Stewart algorithm. They have been implemented into the software package VITESSE v. 2 and are shown to be several orders of magnitude faster than the original implementation of the Elston-Stewart algorithm in VITESSE v. 1 on a variety of real pedigree data. VITESSE v. 2 was faster by a factor ranging from 168 to over 1,700 on these data sets, thus making a qualitative difference in the analysis. The main algorithm is based on the faster computation of the conditional probability of a component nuclear family within the pedigree by summing over the joint genotypes of the children instead of the parents as done in the VITESSE v. 1. This change in summation allows the parent-child transmission part of the calculation to be not only computed for each parent separately, but also for each locus separately by using inheritance vectors as is done in the Lander-Green algorithm. Computing both of these separately can lead to substantial computational savings. The use of inheritance vectors in the nuclear family calculation represents a partial synthesis of the techniques of the Lander-Green algorithm into the Elston-Stewart algorithm. In addition, the technique of local set recoding is introduced to further reduce the complexity of the nuclear family computation. These new algorithms, however, are not universally faster on all types of pedigree data compared to the method implemented in VITESSE v. 1 of summing over the parents. 
Therefore, a hybrid algorithm is introduced which combines the strength of both summation methods by using a numerical heuristic to decide which of the two to use for a given nuclear family within the pedigree and is shown to be faster than either method on its own. Finally, this paper discusses various complexity issues regarding both the Elston-Stewart and Lander-Green algorithms and possible future directions of further synthesis.
12.
We have compared the efficiency of the lod score test which assumes heterogeneity (lod2) to the standard lod score test which assumes homogeneity (lod1) when three-point linkage analysis is used in successive map intervals. If it is assumed that a gene located midway between two linked marker loci is responsible for a proportion of disease cases, then the lod1 test loses power relative to the lod2 test, as the proportion of linked families decreases, as the flanking markers are more closely linked, and as more map intervals are tested. Moreover, when multipoint analysis is used, linkage for a disease gene is more likely to be incorrectly excluded from a complete and dense linkage map if true genetic heterogeneity is ignored. We thus conclude that, in general, the lod2 linkage test is more efficient for detecting a true linkage when a complete genetic marker map is screened for a heterogeneous disorder.
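Illustrative note on the entry above: the abstract does not give formulas for lod1 and lod2. The sketch below assumes that lod2 corresponds to the commonly used admixture heterogeneity lod, in which a proportion alpha of families is linked, and that lod1 is the plain sum of per-family lod scores. The function names and the grid search over alpha are hypothetical choices made for illustration only.

```python
import math

def lod1(family_lods):
    """Homogeneity lod: sum of per-family lod scores at one map position."""
    return sum(family_lods)

def hlod(family_lods, alpha):
    """Admixture lod for a fixed proportion alpha of linked families.

    Each family contributes log10(alpha * 10**Z_i + (1 - alpha)),
    where Z_i is that family's lod score (assumed model, see lead-in).
    """
    return sum(math.log10(alpha * 10.0 ** z + (1.0 - alpha)) for z in family_lods)

def lod2(family_lods, steps=100):
    """Maximise the admixture lod over a grid of alpha values in [0, 1].

    Returns (best_lod, best_alpha).
    """
    return max((hlod(family_lods, k / steps), k / steps) for k in range(steps + 1))

# Hypothetical per-family lod scores: two families support linkage, two oppose it.
example_lods = [1.2, -0.8, 0.9, -1.5]
best_lod, best_alpha = lod2(example_lods)
```

Because alpha = 1 reproduces lod1 and alpha = 0 gives zero, the maximised lod2 can never fall below max(0, lod1), which is consistent with the abstract's observation that ignoring true heterogeneity can lead to incorrect exclusion of linkage.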
13.
Lemire M 《BMC genetics》2005,6(Z1):S159
A simple multipoint procedure to test for parent-of-origin effects in samples of affected siblings is discussed. The procedure consists of artificially changing all full sibs to half-sibs, with distinct mothers or fathers depending on the parental origin to be evaluated, then analyzing these families with commonly used statistics and software. The procedure leads to tests for linkage through mothers or fathers and also leads to a test for imprinting effects in the presence of linkage. Moreover, simulations illustrate that in regions unlinked to susceptibility genes this multipoint procedure does not have an inflated type I error if a sex-averaged genetic map is used, even when large differences exist between male-specific and female-specific maps. In regions linked with susceptibility genes, the test of imprinting is biased under the null hypothesis if differences exist between sex-specific maps, irrespective of the map used in the analysis. The procedure is applied to the Collaborative Study on the Genetics of Alcoholism dataset from the Genetic Analysis Workshop 14. Results indicate that brothers categorized as affected according to the DSM-III-R and Feighner classification show evidence of linkage through fathers to the 6q25 region (p = 0.00038) as well as modest evidence of imprinting (p = 0.018). This region harbors OPRM1, a candidate gene for substance dependence.
14.
Methods based on variance components are powerful tools for linkage analysis of quantitative traits, because they allow simultaneous consideration of all pedigree members. The central idea is to identify loci making a significant contribution to the population variance of a trait, by use of allele-sharing probabilities derived from genotyped marker loci. The technique is only as powerful as the methods used to infer these probabilities, but, to date, no implementation has made full use of the inheritance information in mapping data. Here we present a new implementation that uses an exact multipoint algorithm to extract the full probability distribution of allele sharing at every point in a mapped region. At each locus in the region, the program fits a model that partitions total phenotypic variance into components due to environmental factors, a major gene at the locus, and other unlinked genes. Numerical methods are used to derive maximum-likelihood estimates of the variance components, under the assumption of multivariate normality. A likelihood-ratio test is then applied to detect any significant effect of the hypothesized major gene. Simulations show the method to have greater power than does traditional sib-pair analysis. The method is freely available in a new release of the software package GENEHUNTER.
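Illustrative note on the entry above: the abstract describes the variance partition only in words. One common way to write such a model, given here as an assumption rather than as the paper's exact parameterization (additive effects only, with symbols chosen for this sketch), is the trait covariance between pedigree members i and j:

```latex
\Omega_{ij} \;=\; \hat{\pi}_{ij}\,\sigma_q^2 \;+\; 2\phi_{ij}\,\sigma_g^2 \;+\; \delta_{ij}\,\sigma_e^2 ,
\qquad
\mathrm{LOD} \;=\; \frac{\ln L(\hat{\sigma}_q^2,\hat{\sigma}_g^2,\hat{\sigma}_e^2) \;-\; \ln L(\sigma_q^2 = 0,\hat{\sigma}_g^2,\hat{\sigma}_e^2)}{\ln 10}
```

Here \hat{\pi}_{ij} is the estimated proportion of alleles shared identical by descent at the tested locus, \phi_{ij} is the kinship coefficient, \delta_{ij} is 1 when i = j and 0 otherwise, and \sigma_q^2, \sigma_g^2, \sigma_e^2 are the variances attributed to the major gene, other unlinked genes, and environment. The likelihoods are multivariate normal, each maximised under its own constraint, so the LOD compares the fitted model against the same model with the major-gene variance fixed at zero.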
15.
Efficient multipoint linkage analysis through reduction of inheritance space. Cited by: 14 (self-citations: 0, citations by others: 14)
Computational constraints currently limit exact multipoint linkage analysis to pedigrees of moderate size. We introduce new algorithms that allow analysis of larger pedigrees by reducing the time and memory requirements of the computation. We use the observed pedigree genotypes to reduce the number of inheritance patterns that need to be considered. The algorithms are implemented in a new version (version 2.1) of the software package GENEHUNTER. Performance gains depend on marker heterozygosity and on the number of pedigree members available for genotyping, but typically are 10-1,000-fold, compared with the performance of the previous release (version 2.0). As a result, families with up to 30 bits of inheritance information have been analyzed, and further increases in family size are feasible. In addition to computation of linkage statistics and haplotype determination, GENEHUNTER can also perform single-locus and multilocus transmission/disequilibrium tests. We describe and implement a set of permutation tests that allow determination of empirical significance levels in the presence of linkage disequilibrium among marker loci.
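Illustrative note on the entry above: the abstract does not define "bits of inheritance information". The sketch below assumes the usual Lander-Green convention, in which a pedigree with n non-founders and f founders has 2n - f bits after founder-phase symmetry is removed, so the number of inheritance vectors to consider is 2 to that power. The function names are hypothetical.

```python
def inheritance_bits(n_nonfounders: int, n_founders: int) -> int:
    """Bits in the inheritance vector under the assumed 2n - f convention."""
    return 2 * n_nonfounders - n_founders

def n_inheritance_vectors(n_nonfounders: int, n_founders: int) -> int:
    """Size of the inheritance space before any genotype-based reduction."""
    return 2 ** inheritance_bits(n_nonfounders, n_founders)

# Nuclear family: two founder parents and three children.
nuclear_bits = inheritance_bits(n_nonfounders=3, n_founders=2)
nuclear_vectors = n_inheritance_vectors(n_nonfounders=3, n_founders=2)
```

Under this convention each additional bit doubles the inheritance space, which is why pruning inheritance patterns that are incompatible with the observed genotypes, as the abstract describes, can pay off so heavily for larger families.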
16.
Nonparametric linkage analysis is widely used to map susceptibility genes for complex diseases. This paper introduces six nonparametric statistics for measuring marker allele sharing among the affected members of a pedigree. We compare the power of these new statistics and three previous statistics to detect linkage with Mendelian diseases having recessive, additive, and dominant modes of inheritance. The nine statistics represent all possible combinations of three different IBD scoring functions and three different schemes for sampling genes among affecteds. Our results strongly suggest that the statistic T(rec)(blocks) is best for recessive traits, while the two statistics T(kin)(pairs) and T(all)(kin) vie for best for an additive trait. The best statistic for a dominant trait is less clear. The statistics T(kin)(pairs) and T(all)(kin) are equally promising for small sibships, but in extended pedigrees the statistics T(dom)(blocks) and T(dom)(pairs) appear best. For a complex trait, we advocate computing several of these statistics.
17.
Comparison of a multipoint identity-by-descent method with parametric multipoint linkage analysis for mapping quantitative traits.
We previously developed a method of partitioning genetic variance of a quantitative trait to loci in specific chromosomal regions. In this paper, we compare this method--multipoint IBD (identical by descent) method (MIM)--with parametric multipoint linkage analysis (MLINK). A simulation study was performed comparing the methods for the major-locus, mixed, and two-locus models. The criterion for comparisons between MIM and MLINK was the average lod score from multiple replicates of simulated data sets. The effects of gene frequency, dominance, model misspecification, marker spacing, and informativeness are also considered in a smaller set of simulations. Within the context of the models examined, the MIM approach was found to be comparable in power with parametric multipoint linkage analysis when (a) parental data are unknown, (b) the effect of the major locus is small and there is additional genetic variation, or (c) the parameters of the major-locus model are misspecified. The performance of the MIM method relative to MLINK was markedly lower when the allele frequency at the trait locus was .2 versus .5, particularly for the case when parental data were assumed to be known. Dominance at the trait major locus, as well as marker spacing and heterozygosity, did not appear to have a large effect on the ELOD comparisons.
18.
We have compared the power of a large number of allele-sharing statistics for "nonparametric" linkage analysis with affected sibships. Our rationale was that there is an extensive literature comparing statistics for sibling pairs but that there has not been much guidance on how to choose statistics for studies that include sibships of various sizes. We concentrated on statistics that can be described as assigning scores to each identity-by-descent-sharing configuration that a pedigree might take on (Whittemore and Halpern 1994). We considered sibships of sizes two through five, 27 different genetic models, and varying recombination fractions between the marker and the trait locus. We tried to identify statistics whose power was robust over a wide variety of models. We found that the statistic that is probably used most often in such studies, S(all), performs quite well, although it is not necessarily the best. We also found several other statistics (such as the R criterion, S(robdom), and the Sobel-and-Lange statistic C) that perform well in most situations, a few (such as S(-#geno) and the Feingold-and-Siegmund version of S(pairs)) that have high power only in very special situations, and a few (such as S(-#geno), the N criterion, and the Sobel-and-Lange statistic B) that seem to have low power for the majority of the trait models. For the most part, the same statistics performed well for all sibship sizes. We also used our results to give some suggestions regarding how to weight sibships of different sizes, in forming an overall statistic.
19.
20.
Summary: Spinocerebellar ataxia (SCA) was studied in a seven-generation (Schut-Swier) kindred using linkage analysis to localize further the autosomal dominant, HLA-linked, disease-producing SCA1 locus relative to four other loci that map to the short arm of human chromosome 6. Genotypes for each locus were determined in as many individuals as possible from a total of 162 affected and unaffected family members that were studied. A maximum pairwise lod score of 8.52 (θm = 0.10, θf = 0.22) for linkage between SCA1 and HLA-A was observed. Multipoint linkage analyses for the SCA1, HLA-A, F13A, D6S7, and GLO1 loci revealed that the SCA1 locus is most probably located telomeric to HLA-A, with a likely location between HLA-A and F13A.