Similar literature
1.
The affected-pedigree-member (APM) method of linkage analysis is designed to detect departures from independent segregation of disease and marker phenotypes. The underlying statistic of the APM method operates on the identity-by-state relations implied by the marker phenotypes of the affected members within a pedigree. Here we generalize the APM statistic to multiple linked markers. This generalization relies on recursive computation of two-locus kinship coefficients by an algorithm of Thompson. The distributional properties of the extended APM statistic are investigated theoretically and by simulation in the context of one real and one artificial data set. In both examples, the multilocus statistic tends to reject the null hypothesis of independent segregation between the disease locus and the marker loci more strongly than the single-locus statistics do.
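
The identity-by-state (IBS) relation that the abstract above builds on can be illustrated with a minimal Python sketch. It only counts alleles shared by state between unordered genotypes and averages that count over pairs of affected individuals. The function names `ibs_sharing` and `mean_pairwise_ibs` are hypothetical, and the sketch does not attempt to reproduce the APM statistic itself or its multilocus extension.

```python
from itertools import combinations


def ibs_sharing(genotype_a, genotype_b):
    """Count the alleles (0, 1 or 2) that two unordered genotypes share by state."""
    remaining = list(genotype_b)
    shared = 0
    for allele in genotype_a:
        if allele in remaining:
            # Each allele of genotype_b can be matched only once.
            remaining.remove(allele)
            shared += 1
    return shared


def mean_pairwise_ibs(genotypes):
    """Average IBS sharing over all pairs of individuals (hypothetical summary)."""
    pairs = list(combinations(genotypes, 2))
    return sum(ibs_sharing(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    affected = [("A", "B"), ("A", "A"), ("B", "A")]
    print(mean_pairwise_ibs(affected))
```

Treating genotypes as unordered pairs means that `("A", "B")` and `("B", "A")` share two alleles, which is the behaviour one would want for unphased marker data.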

2.
There is increasing interest in the basis of commonly observed heterozygosity-fitness correlations (HFCs). Two models appear possible: a genome-wide effect due to inbreeding depression, and a single-locus effect due to chance linkage to one or more genes experiencing balancing selection. Recent studies suggest that the latter tends to be more important in the majority of studies, but tests for the presence of single-locus effects tend to be rather weak. One of the problems is that the linkage disequilibrium between a microsatellite and a nearby gene experiencing balancing selection is never likely to be 100%. With this in mind, we conduct stochastic simulations aimed at determining the conditions under which single-locus HFCs may develop. We also suggest a new approach that could improve detection of HFCs and that also offers a more general method for detecting genotype-fitness correlations. Our method is based on looking for the maximum possible strength of association between genotype and fitness, and then asking whether randomized data sets are able to generate similarly strong associations. This method is tested on both simulated and real data. In both cases, our method generates greater levels of significance than current tests. Applied to previously published data from wild boar affected by tuberculosis, the method uncovers a strong single-allele association that is strongly predictive of whether the disease is localized or spreads throughout the body. We further suggest a simple method for dealing with the problem of population structure, and believe this approach will help to identify genomic regions associated with fitness.
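
The idea of taking the strongest genotype-fitness association and comparing it with randomized data sets can be sketched as a simple permutation test. This is a minimal Python illustration under stated assumptions, not the authors' implementation: the association measure (the largest absolute difference in mean fitness between carriers and non-carriers of any allele) and the names `max_allele_association` and `permutation_p_value` are hypothetical choices made for the example.

```python
import random


def max_allele_association(genotypes, fitness):
    """Largest absolute difference in mean fitness between carriers and
    non-carriers of any single allele (an assumed association measure)."""
    alleles = {allele for genotype in genotypes for allele in genotype}
    best = 0.0
    for allele in alleles:
        carriers = [f for g, f in zip(genotypes, fitness) if allele in g]
        others = [f for g, f in zip(genotypes, fitness) if allele not in g]
        if carriers and others:
            diff = abs(sum(carriers) / len(carriers) - sum(others) / len(others))
            best = max(best, diff)
    return best


def permutation_p_value(genotypes, fitness, n_perm=1000, seed=0):
    """Fraction of fitness permutations whose maximum association is at least
    as strong as the observed maximum (with the usual +1 correction)."""
    rng = random.Random(seed)
    observed = max_allele_association(genotypes, fitness)
    shuffled = list(fitness)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if max_allele_association(genotypes, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    genotypes = [("A", "C")] * 10 + [("A", "B")] * 10
    fitness = [1.0] * 10 + [0.0] * 10
    print(permutation_p_value(genotypes, fitness))
```

Because the maximum is recomputed for every permuted data set, the search over alleles is built into the null distribution, so no separate multiple-testing correction is needed for that search.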

3.
Genome-wide linkage analysis using microsatellite markers has been successful in the identification of numerous Mendelian and complex disease loci. The recent availability of high-density single-nucleotide polymorphism (SNP) maps provides a potentially more powerful option. Using the simulated and Collaborative Study on the Genetics of Alcoholism (COGA) datasets from the Genetic Analysis Workshop 14 (GAW14), we examined how altering the density of SNP marker sets impacted the overall information content, the power to detect trait loci, and the number of false positive results. For the simulated data we used SNP maps with density of 0.3 cM, 1 cM, 2 cM, and 3 cM. For the COGA data we combined the marker sets from Illumina and Affymetrix to create a map with average density of 0.25 cM and then, using a sub-sample of these markers, created maps with density of 0.3 cM, 0.6 cM, 1 cM, 2 cM, and 3 cM. For each marker set, multipoint linkage analysis using MERLIN was performed for both dominant and recessive traits derived from marker loci. Our results showed that information content increased with increased map density. For the homogeneous, completely penetrant traits we created, there was only a modest difference in ability to detect trait loci. Additionally, as map density increased there was only a slight increase in the number of false positive results when there was linkage disequilibrium (LD) between markers. The presence of LD between markers may have led to an increased number of false positive regions, but no clear relationship between regions of high LD and locations of false positive linkage signals was observed.

4.
Quantitative trait loci (QTL) affecting the phenotype of interest can be detected using linkage analysis (LA), linkage disequilibrium (LD) mapping or a combination of both (LDLA). The LA approach uses information from recombination events within the observed pedigree, and LD mapping uses information from the historical recombinations within the unobserved pedigree. We propose a Bayesian variable selection approach for combined LDLA analysis of single-nucleotide polymorphism (SNP) data. The novel approach uses both sources of information simultaneously, as is commonly done in plant and animal genetics, but it makes fewer assumptions about population demography than previous LDLA methods. This differs from approaches in human genetics, where LDLA methods use LA information conditional on LD information or the other way round. We argue that the multilocus LDLA model is more powerful for the detection of phenotype–genotype associations than single-locus LDLA analysis. To illustrate the performance of the Bayesian multilocus LDLA method, we analyzed simulation replicates based on real SNP genotype data from small three-generational CEPH families and compared the results with the commonly used quantitative transmission disequilibrium test (QTDT). This paper is intended to be conceptual in the sense that it is not meant to be a practical method for analyzing high-density SNP data, which is more common. Our aim was to test whether this approach can function in principle.

5.
Tan YD  Fu YX 《Genetics》2007,175(2):923-931
Although most high-density linkage maps have been constructed from codominant markers such as single-nucleotide polymorphisms (SNPs) and microsatellites due to their high linkage information, dominant markers can be expected to be even more significant as proteomic techniques become widely applicable to generate protein polymorphism data from large samples. However, for dominant markers, two possible linkage phases between a pair of markers complicate the estimation of recombination fractions between markers and consequently the construction of linkage maps. The low linkage information of the repulsion phase and the high linkage information of the coupling phase have led geneticists to construct two separate but related linkage maps. To circumvent this problem, we propose a new method for estimating the recombination fraction between markers, which greatly improves the accuracy of estimation by distinguishing between the coupling phase and the repulsion phase of the linked loci. The results obtained from both real and simulated F2 dominant marker data indicate that the recombination fractions estimated by the new method contain a large amount of linkage information for constructing a complete linkage map. In addition, the new method is also applicable to data with mixed types of markers (dominant and codominant) with unknown linkage phase.

6.
Yang Y  Ott J 《Human heredity》2002,53(4):227-236
In genome-wide screens of genetic marker loci, non-Mendelian inheritance of a marker is taken to indicate its vicinity to a disease locus. Heritable complex traits are thought to be under the influence of multiple, possibly interacting, susceptibility loci, yet the most frequently used methods of linkage and association analysis focus on one susceptibility locus at a time. Here we introduce log-linear models for the joint analysis of multiple marker loci and interaction effects between them. Our approach focuses on affected sib pair data and the identity-by-descent (IBD) allele sharing values observed on them. For each heterozygous parent, the IBD values at linked markers represent a sequence of dependent binary variables. We develop log-linear models for the joint distribution of these IBD values. An independence log-linear model is proposed to model the marginal means, and the neighboring interaction model is advocated to account for associations between adjacent markers. Under the assumption of conditional independence, likelihood methods are applied to simulated data containing one or two susceptibility loci. It is shown that the neighboring interaction log-linear model is more efficient than the independence model, and incorporating interaction in the two-locus analysis provides increased power and accuracy for mapping of the trait loci.

7.
This paper presents a method of performing model-free LOD-score based linkage analysis on quantitative traits. It is implemented in the QMFLINK program. The method is used to perform a genome screen on the Framingham Heart Study data. A number of markers that show some support for linkage in our study coincide substantially with those implicated in other linkage studies of hypertension. Although the new method needs further testing on additional real and simulated data sets, we can already say that it is straightforward to apply and may offer a useful complementary approach to previously available methods for the linkage analysis of quantitative traits.

8.
Holmans P 《Human heredity》2002,53(2):92-102
Interest has recently focussed on allowing for interactions between loci as a way to increase power to detect linkage. In this paper, a simplified logistic regression method was used to perform affected sib pair analyses allowing for the inclusion of data from other loci. A systematic search of two-locus disease models was carried out to determine the situations in which this was advantageous. If IBD information is available (e.g. from a genome scan), it is unlikely that allowing for interactions will give a large lod score in the absence of linkage evidence from single-locus analysis. Furthermore, allowing for interactions rarely gave a significant increase in power to detect linkage over a single-locus analysis, except for heterogeneity models with low K(P). Conversely, the availability of disease-associated genotypes may greatly increase the power to detect both linkage to a second locus and interaction between the loci. These results indicate that when only IBD information is available, two-locus analysis of genome scan data should be restricted to regions giving peaks under single-locus analysis. If disease-associated genotypes are available, it may be worth re-analysing the whole genome.

9.
Kim S  Zhang K  Sun F 《BMC genetics》2003,4(Z1):S9
Complex diseases are generally caused by intricate interactions of multiple genes and environmental factors. Most available linkage and association methods are developed to identify individual susceptibility genes assuming a simple disease model blind to any possible gene-gene and gene-environment interactions. We used a set association method that uses single-nucleotide polymorphism markers to locate genetic variation responsible for complex diseases in which multiple genes are involved. Here we extended the set association method from bi-allelic to multiallelic markers. In addition, we studied the type I error rates and power for both approaches using simulations based on the coalescent process. Both bi-allelic set association (BSA) and multiallelic set association (MSA) tests have the correct type I error rates. In addition, BSA and MSA can have more power than individual marker analysis when multiple genes are involved in a complex disease. We applied the MSA approach to the simulated data sets from Genetic Analysis Workshop 13. High cholesterol level was used as the definitive phenotype for a disease. MSA failed to detect markers with significant linkage disequilibrium with genes responsible for cholesterol level. This is due to the wide spacing between the markers and the lack of association between the marker loci and the simulated phenotype.

10.
A complete enumeration and classification of two-locus disease models   (cited 7 times in total: 0 self-citations, 7 by others)
Li W  Reich J 《Human heredity》2000,50(6):334-349
There are 512 two-locus, two-allele, two-phenotype, fully penetrant disease models. Using the permutation between two alleles, between two loci, and between being affected and unaffected, one model can be considered to be equivalent to another model under the corresponding permutation. These permutations greatly reduce the number of two-locus models in the analysis of complex diseases. This paper determines the number of nonredundant two-locus models (which can be 102, 100, 96, 51, 50, or 58, depending on which permutations are used, and depending on whether zero-locus and single-locus models are excluded). Whenever possible, these nonredundant two-locus models are classified by their properties. Besides the familiar features of multiplicative models (logical AND), heterogeneity models (logical OR), and threshold models, new classifications are added or expanded: modifying-effect models, logical XOR models, interference and negative interference models (neither dominant nor recessive), conditionally dominant/recessive models, missing lethal genotype models, and highly symmetric models. The following aspects of two-locus models are studied: the marginal penetrance tables at both loci, the expected joint identity-by-descent (IBD) probabilities, and the correlation between marginal IBD probabilities at the two loci. These studies are useful for linkage analyses using single-locus models while the underlying disease model is two-locus, and for correlation analyses using the linkage signals at different locations obtained by a single-locus model.
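
The counting argument in this abstract can be checked by brute force. The sketch below is a hedged Python illustration, not the paper's procedure: it assumes each fully penetrant model is a 3x3 table of 0/1 penetrances indexed by the genotypes at the two loci, and it encodes the allele, locus, and phenotype permutations as the hypothetical helpers `flip_rows`, `flip_cols`, `transpose`, and `complement`. Models that can be mapped onto each other by these operations are grouped into one class.

```python
from itertools import product


def transpose(model):
    """Swap the roles of the two loci."""
    return tuple(zip(*model))


def flip_rows(model):
    """Swap the two alleles at the first locus (reverses the genotype order)."""
    return model[::-1]


def flip_cols(model):
    """Swap the two alleles at the second locus."""
    return tuple(row[::-1] for row in model)


def complement(model):
    """Swap the affected and unaffected phenotypes."""
    return tuple(tuple(1 - cell for cell in row) for row in model)


def all_models():
    """Yield every 3x3 table of 0/1 penetrances (2**9 = 512 tables)."""
    for bits in product((0, 1), repeat=9):
        yield (bits[0:3], bits[3:6], bits[6:9])


def orbit(model, generators):
    """All models reachable from `model` by repeatedly applying the generators."""
    seen = {model}
    stack = [model]
    while stack:
        current = stack.pop()
        for generator in generators:
            image = generator(current)
            if image not in seen:
                seen.add(image)
                stack.append(image)
    return seen


def count_nonredundant(generators):
    """Number of equivalence classes of models under the given permutations."""
    remaining = set(all_models())
    classes = 0
    while remaining:
        representative = remaining.pop()
        remaining -= orbit(representative, generators)
        classes += 1
    return classes


if __name__ == "__main__":
    geometric = [flip_rows, flip_cols, transpose]
    print(count_nonredundant(geometric))
    print(count_nonredundant(geometric + [complement]))
```

Under these assumptions the two printed counts are 102 (allele and locus permutations only) and 51 (adding the affected/unaffected swap), which agree with two of the figures quoted in the abstract. The other quoted figures depend on excluding zero-locus and single-locus models, which this sketch does not attempt.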

11.
In case-control studies, genetic associations for complex diseases may be probed either with single-locus tests or with haplotype-based tests. Although there are different views on the relative merits and preferences of the two test strategies, haplotype-based analyses are generally believed to be more powerful to detect genes with modest effects. However, a main drawback of haplotype-based association tests is the large number of distinct haplotypes, which increases the degrees of freedom for corresponding test statistics and thus reduces the statistical power. To decrease the degrees of freedom and enhance the efficiency and power of haplotype analysis, we propose an improved haplotype clustering method that is based on the haplotype cladistic analysis developed by Durrant et al. In our method, we attempt to combine the strengths of single-locus analysis and haplotype-based analysis into one single test framework. Novel in our method is that we develop a more informative haplotype similarity measurement by using p-values obtained from single-locus association tests to construct a measure of weight, which to some extent incorporates the information of disease outcomes. The weights are then used in computation of similarity measures to construct distance metrics between haplotype pairs in haplotype cladistic analysis. To assess our proposed new method, we performed simulation analyses to compare the relative performances of (1) conventional haplotype-based analysis using original haplotype, (2) single-locus allele-based analysis, (3) original haplotype cladistic analysis (CLADHC) by Durrant et al., and (4) our weighted haplotype cladistic analysis method, under different scenarios. Our weighted cladistic analysis method shows an increased statistical power and robustness, compared with the methods of haplotype cladistic analysis, single-locus test, and the traditional haplotype-based analyses. 
Analyses of real data also show that our proposed method has practical significance in human genetics.

12.
A. Ruiz  A. Barbadilla 《Genetics》1995,139(1):445-455
Using Cockerham's approach of orthogonal scales, we develop genetic models for the effect of an arbitrary number of multiallelic quantitative trait loci (QTLs) or neutral marker loci (NMLs) upon any number of quantitative traits. These models allow the unbiased estimation of the contributions of a set of marker loci to the additive and dominance variances and covariances among traits in a random mating population. The method has been applied to an analysis of allozyme and quantitative data from the European oyster. The contribution of a set of marker loci may either be real, when the markers are actually QTLs, or apparent, when they are NMLs that are in linkage disequilibrium with hidden QTLs. Our results show that the additive and dominance variances contributed by a set of NMLs are always minimum estimates of the corresponding variances contributed by the associated QTLs. In contrast, the apparent contribution of the NMLs to the additive and dominance covariances between two traits may be larger than, equal to or lower than the actual contributions of the QTLs. We also derive an expression for the expected variance explained by the correlation between a quantitative trait and multilocus heterozygosity. This correlation explains only a part of the genetic variance contributed by the markers, i.e., in general, a combination of additive and dominance variances and, thus, provides only very limited information relative to the method supplied here.

13.
We present a maximum likelihood method for mapping quantitative trait loci that uses linkage disequilibrium information from single and multiple markers. We made paired comparisons between analyses using a single marker, two markers and six markers. We also compared the method to single marker regression analysis under several scenarios using simulated data. In general, our method outperformed regression (smaller mean square error and confidence intervals of location estimate) for quantitative trait loci with dominance effects. In addition, the method provides estimates of the frequency and additive and dominance effects of the quantitative trait locus.

14.
There is growing evidence that a map of dense single-nucleotide polymorphisms (SNPs) can outperform a map of sparse microsatellites for linkage analysis. There is also argument as to whether a clustered SNP map can outperform an evenly spaced SNP map. Using Genetic Analysis Workshop 14 simulated data, we compared microsatellites, SNPs, and composite markers derived from SNPs for linkage analysis. We encoded the composite markers in a two-step approach, in which the maximum identity length contrast method was employed to allow for recombination between loci. A SNP map 2.3 times as dense as a microsatellite map (approximately 2.9 cM compared to approximately 6.7 cM apart) provided slightly less information content (approximately 0.83 compared to approximately 0.89). Most inheritance information could be extracted when the SNPs were spaced < 1 cM apart. Comparing the linkage results obtained with SNPs and with composite markers derived from them, based on both 3 cM and 0.3 cM resolution maps, we showed that the inter-SNP distance should be kept small (< 1 cM), and that for multipoint linkage analysis the original markers and the derived composite markers had similar power, whereas for single-point linkage analysis the composite markers led to more power. Considering all factors, such as information content, flexibility of analysis method, map errors, and genotyping errors, a map of clustered SNPs can be an efficient design for a genome-wide linkage scan.

15.
Association studies offer an exciting approach to finding underlying genetic variants of complex human diseases. However, identification of genetic variants still includes difficult challenges, and it is important to develop powerful new statistical methods. Currently, association methods may depend on single-locus analysis (that is, analysis of the association of one locus at a time, typically a single-nucleotide polymorphism, or SNP) or on multilocus analysis, in which multiple SNPs are used to allow extraction of maximum information about linkage disequilibrium (LD). It has been shown that single-locus analysis may have low power because a single SNP often has limited LD information. Multilocus analysis, which is more informative, can be performed on the basis of either haplotypes or genotypes. It may lose power because of the often large number of degrees of freedom involved. The ideal method must make full use of important information from multiple loci but avoid increasing the degrees of freedom. Therefore, we propose a method to capture information from multiple SNPs but with the use of fewer degrees of freedom. When a set of SNPs in a block are correlated because of LD, we might expect that the genotype variation among the different phenotypic groups would extend across all the SNPs, and this information could be compressed into the low-frequency components of a Fourier transform. Therefore, we develop a test based on weighted Fourier transformation coefficients, with more weight given to the low-frequency components. Our simulation results demonstrate the validity and substantially higher power of the proposed method compared with other common methods. This method provides an additional tool to existing methods for identification of causative genetic variants underlying complex diseases.

16.
Reconstruction of sibling relationships from genetic data is an important component of many biological applications. In particular, the growing application of molecular markers (microsatellites) to study wild populations of plants and animals has created the need for new computational methods of establishing pedigree relationships, such as sibgroups, among individuals in these populations. Most current methods for sibship reconstruction from microsatellite data use statistical and heuristic techniques that rely on a priori knowledge about various parameter distributions. Moreover, these methods are designed for data with a large number of sampled loci and small family groups, both of which typically do not hold for wild populations. We present a deterministic technique that parsimoniously reconstructs sibling groups using only Mendelian laws of inheritance. We validate our approach using both simulated and real biological data and compare it to other methods. Our method is highly accurate on real data and compares favorably with other methods on simulated data with few loci and large family groups. It is the only method that does not rely on a priori knowledge about the population under study. Thus, our method is particularly appropriate for reconstructing sibling groups in wild populations.

17.
We derive a multivariate survival model for age of onset data of a sibship from an additive genetic gamma frailty model constructed on the basis of the inheritance vectors, and investigate the properties of this model. Based on this model, we propose a retrospective likelihood approach for genetic linkage analysis using sibship data. This test is an allele-sharing-based test, and does not require specification of genetic models or the penetrance functions. This new approach can incorporate both affected and unaffected sibs, environmental covariates and age of onset or age at censoring information and, therefore, provides a practical solution for mapping genes for complex diseases with variable age of onset. A small simulation study indicates that the proposed method performs better than the commonly used allele-sharing-based methods for linkage analysis, especially when the population disease rate is high. We applied this method to a type 1 diabetes sib pair data set and a small breast cancer data set. Both simulated and real data sets also indicate that the method is relatively robust to misspecification of the baseline hazard function.

18.
Detection of tandem duplications and implications for linkage analysis.   (cited 1 time in total: 1 self-citation, 0 by others)
The first demonstration of an autosomal dominant human disease caused by segmental trisomy came in 1991 for Charcot-Marie-Tooth disease type 1A (CMT1A). For this disorder, the segmental trisomy is due to a large tandem duplication of 1.5 Mb of DNA located on chromosome 17p11.2-p12. The search for the CMT1A disease gene was misdirected and impeded because some chromosome 17 genetic markers that are linked to CMT1A lie within this duplication. To better understand how such a duplication might affect genetic analyses in the context of disease gene mapping, we studied the effects of marker duplication on transmission probabilities of marker alleles, on linkage analysis of an autosomal dominant disease, and on tests of linkage homogeneity. We demonstrate that the undetected presence of a duplication distorts transmission ratios, hampers fine localization of the disease gene, and increases false evidence of linkage heterogeneity. In addition, we devised a likelihood-based method for detecting the presence of a tandemly duplicated marker when one is suspected. We tested our methods through computer simulations and on CMT1A pedigrees genotyped at several chromosome 17 markers. On the simulated data, our method detected 96% of duplicated markers (with a false-positive rate of 5%). On the CMT1A data our method successfully identified two of three loci that are duplicated (with no false positives). This method could be used to identify duplicated markers in other regions of the genome and could be used to delineate the extent of duplications similar to that involved in CMT1A.

19.
Single nucleotide polymorphisms (SNPs), or biallelic markers, are popular in genetic linkage studies due to their abundance in the genome, stability, and ease of scoring. We determined the 'information ratio' (IR) of closely spaced SNPs in simulated nuclear families and affected sib pairs (ASPs). (The IR is the ratio of the actual average maximum lod score to the maximum lod score attainable if the marker were fully informative.) The nuclear families included parental information, whereas the ASPs did not. We analyzed these SNPs in two ways: (1) using multipoint analysis, and (2) treating the SNPs as 'composite markers' (i.e., haplotypes, as assigned by GENEHUNTER). We also calculated the IR of a single microsatellite marker with multiple alleles and compared it with the IR from the SNPs. For each set of input conditions, we simulated 1000 nuclear families, of 2, 3, 4, or 5 children each, as well as 1000 ASPs. We generated SNP marker data for strings of k = 1, 2, 3, 5, 7, and 10 SNP loci, with no recombination (theta = 0) and no linkage disequilibrium among the SNPs. The MAF (minor allele frequency) was either 0.5 or 0.25, and allele frequencies were the same for all k loci in any analysis. We also generated marker data for one single-locus microsatellite marker, with m = 3, 4, 5, 6, 7, and 9 equally frequent alleles. In all simulations, the disease was fully penetrant dominant, and there was no recombination or linkage disequilibrium among markers or between marker and disease. When multipoint analysis was used, we found that 5-7 closely spaced SNPs were usually enough to yield an IR of approximately 100%, for nuclear families of any size. However, for the ASPs, even 7-10 SNPs yielded an IR of only 70-80%. A microsatellite with 9 equally frequent alleles yielded about the same IR (86-88%) as a string of 4-5 SNPs, in nuclear families. SNPs analyzed as 'composite markers' performed worse, due to the inherent ambiguity of SNP haplotyping.

20.
As the extent of human genetic variation becomes more fully characterized, the research community is faced with the challenging task of using this information to dissect the heritable components of complex traits. Genomewide association studies offer great promise in this respect, but their analysis poses formidable difficulties. In this article, we describe a computationally efficient approach to mining genotype-phenotype associations that scales to the size of the data sets currently being collected in such studies. We use discrete graphical models as a data-mining tool, searching for single- or multilocus patterns of association around a causative site. The approach is fully Bayesian, allowing us to incorporate prior knowledge on the spatial dependencies around each marker due to linkage disequilibrium, which considerably reduces the number of possible graphical structures. A Markov chain Monte Carlo scheme is developed that yields samples from the posterior distribution of graphs conditional on the data, from which probabilistic statements about the strength of any genotype-phenotype association can be made. Using data simulated under scenarios that vary in marker density, genotype relative risk of a causative allele, and mode of inheritance, we show that the proposed approach has better localization properties and leads to lower false-positive rates than do single-locus analyses. Finally, we present an application of our method to a quasi-synthetic data set in which data from the CYP2D6 region are embedded within simulated data on 100K single-nucleotide polymorphisms. Analysis is quick (<5 min), and we are able to localize the causative site to a very short interval.
