Similar literature
 20 similar articles found (search time: 31 ms)
1.
The affected-pedigree-member method of linkage analysis.   Total citations: 67 (self-citations: 45; citations by others: 22)
This paper describes a generalization of the affected-sib-pair method of linkage analysis to pedigrees. By substituting identity-by-state relations for identity-by-descent relations, we develop a test statistic for detecting departures from independent segregation of disease and marker phenotypes. The statistic is based on the marker phenotypes of affected pedigree members only. Since it is more striking for distantly related affected individuals to share a rare marker allele than a common marker allele, the statistic also includes a weighting factor based on allele frequency. The distributional properties of the statistic are investigated theoretically and by simulation. Part of the theoretical treatment entails generalizing Karigl's multiple-person kinship coefficients. When the test statistic is applied to pedigree data on Huntington disease, the null hypothesis of independent segregation between the marker locus and the disease locus is firmly rejected. In this case, as expected, there is a loss of power when compared with standard lod-score analysis. However, our statistic possesses the advantage of requiring no explicit assumptions about the mode of inheritance of the disease. This point is illustrated by application of the test statistic to data on rheumatoid arthritis.
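The weighting idea in this abstract can be illustrated with a minimal Python sketch. It rests on assumptions that are not stated in the abstract: genotypes are unordered pairs of allele labels, the similarity of two affected relatives averages the four allele-to-allele comparisons, and a match on an allele of frequency p is weighted by a function such as 1/sqrt(p). The function name and the allele frequencies are hypothetical.

```python
from itertools import product
from math import sqrt


def ibs_similarity(genotype_a, genotype_b, allele_freq, weight=lambda p: 1.0 / sqrt(p)):
    """Weighted identity-by-state similarity between two genotypes.

    Averages the four allele-vs-allele comparisons. A match on allele x
    contributes weight(allele_freq[x]); a mismatch contributes 0.
    """
    total = 0.0
    for x, y in product(genotype_a, genotype_b):
        if x == y:
            total += weight(allele_freq[x])
    return total / 4.0


# Hypothetical allele frequencies: "A" is common, "B" is rare.
freqs = {"A": 0.6, "B": 0.05, "C": 0.2, "D": 0.15}

# Two relative pairs that each share exactly one allele identical by state.
rare_share = ibs_similarity(("B", "C"), ("B", "D"), freqs)
common_share = ibs_similarity(("A", "C"), ("A", "D"), freqs)

# Without weighting, both pairs score the same.
unweighted = ibs_similarity(("B", "C"), ("B", "D"), freqs, weight=lambda p: 1.0)
```

With the frequency-based weight, sharing the rare allele scores higher than sharing the common one, which is the behaviour the abstract motivates; the full statistic also needs its null mean and variance, which this sketch does not attempt.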

2.
The affected-pedigree-member (APM) method of linkage analysis is designed to detect departures from independent segregation of disease and marker phenotypes. The underlying statistic of the APM method operates on the identity-by-state relations implied by the marker phenotypes of the affected within a pedigree. Here we generalize the APM statistic to multiple linked markers. This generalization relies on recursive computation of two-locus kinship coefficients by an algorithm of Thompson. The distributional properties of the extended APM statistic are investigated theoretically and by simulation in the context of one real and one artificial data set. In both examples, the multilocus statistic tends to reject, more strongly than the single-locus statistics do, the null hypothesis of independent segregation between the disease locus and the marker loci.

3.
In studies of complex diseases, a common paradigm is to conduct association analysis at markers in regions identified by linkage analysis, to attempt to narrow the region of interest. Family-based tests for association based on parental transmissions to affected offspring are often used in fine-mapping studies. However, for diseases with late onset, parental genotypes are often missing. Without parental genotypes, family-based tests either compare allele frequencies in affected individuals with those in their unaffected siblings or use siblings to infer missing parental genotypes. An example of the latter approach is the score test implemented in the computer program TRANSMIT. The inference of missing parental genotypes in TRANSMIT assumes that transmissions from parents to affected siblings are independent, which is appropriate when there is no linkage. However, using computer simulations, we show that, when the marker and disease locus are linked and the data set consists of families with multiple affected siblings, this assumption leads to a bias in the score statistic under the null hypothesis of no association between the marker and disease alleles. This bias leads to an inflated type I error rate for the score test in regions of linkage. We present a novel test for association in the presence of linkage (APL) that correctly infers missing parental genotypes in regions of linkage by estimating identity-by-descent parameters, to adjust for correlation between parental transmissions to affected siblings. In simulated data, we demonstrate the validity of the APL test under the null hypothesis of no association and show that the test can be more powerful than the pedigree disequilibrium test and family-based association test. As an example, we compare the performance of the tests in a candidate-gene study in families with Parkinson disease.

4.
We present here four nonparametric statistics for linkage analysis that test whether pairs of affected relatives share marker alleles more often than expected. These statistics are based on simulating the null distribution of a given statistic conditional on the unaffecteds' marker genotypes. Each statistic uses a different measure of marker sharing: the SimAPM statistic uses the simulation-based affected-pedigree-member measure based on identity-by-state (IBS) sharing. The SimKIN (kinship) measure is 1.0 for identity-by-descent (IBD) sharing, 0.0 for no IBD status sharing, and the kinship coefficient when the IBD status is ambiguous. The simulation-based IBD (SimIBD) statistic uses a recursive algorithm to determine the probability of two affecteds sharing a specific allele IBD. The SimISO statistic is identical to SimIBD, except that it also measures marker similarity between unaffected pairs. We evaluated our statistics on data simulated under different two-locus disease models, comparing our results to those obtained with several other nonparametric statistics. Use of IBD information produces dramatic increases in power over the SimAPM method, which uses only IBS information. The power of our best statistic in most cases meets or exceeds the power of the other nonparametric statistics. Furthermore, our statistics perform comparisons between all affected relative pairs within general pedigrees and are not restricted to sib pairs or nuclear families.

5.
Investigators of genetic illnesses are currently employing life-table techniques to estimate the lifetime risk of disease and the age-at-onset distribution. This methodology assumes that onset ages are known for affected individuals and that censoring ages are known for unaffected individuals. We extend these methods to incorporate affected individuals with unknown onset ages and unaffected persons with unknown censoring ages and illustrate how conventional life-table methods can produce seriously biased estimates, particularly of lifetime risk. The methodology is not restricted to genetic illnesses and can be applied to more complex illnesses with unknown etiology. We present an example for Huntington disease, which is generally assumed to be a Mendelian autosomal dominant disease, yielding estimates of lifetime risk of .503 ± .70 and mean onset age of 47.7 ± 3.1 years for offspring with a single affected parent. When conventional life-table techniques are employed, these estimates are .238 ± .032 and 43.2 ± 2.2.

6.
The degree to which DNA similarity is related to kinship and population structure in natural populations was investigated for a small population of cooperatively breeding Red-cockaded Woodpeckers (Picoides borealis) in the western Piedmont region of South Carolina. An independent pedigree was established from records of color-banded individuals. Results of DNA profiles were then examined relative to this pedigree. DNA similarity among unrelated woodpeckers averaged 0.55 ± 0.01 (SE). The mean number of DNA bands scored and similarity did not significantly differ between founders and the current population. Examination of parentage in 10 families indicated that multiple paternity did not occur when band-by-band comparisons or similarity values were compared among parents, helpers, and offspring. Thus, Red-cockaded Woodpeckers were monogamous in this population. DNA similarity among all individuals ranged from 0.32 to 0.78. Distribution of these similarity values by kinship resulted in some overlap with other kin values. Therefore, specific similarity values could not be assigned a kinship value without knowledge of the pedigree. However, least-squares linear regression indicated that similarity was significantly related to kinship (P < 0.05). These results indicate that DNA profiles may be important in quantifying population structure; however, they must be used in conjunction with a known pedigree before any assessment of kinship among individuals is made. Band-by-band comparisons remain a viable technique for examination of parentage when all putative parents have been sampled.

7.
DNA pooling is a potential methodology for detecting genetic loci of small effect that contribute to complex diseases and quantitative traits. This is accomplished by rapid preliminary screening of the genome for allelic association with the most common class of polymorphic short tandem repeat markers. The methodology assumes a common founder for the linked disease locus of interest and searches for a region of a chromosome shared between affected individuals. The general theory of DNA pooling relies on the observed differences in allelic distribution between pools from affected and unaffected individuals, including a reduction in the number of alleles in the affected pool, which indicates the sharing of a chromosomal region. The power of the statistic for associated linkage mapping can be determined using two recently developed strategies: first, by measuring the differences in allelic image patterns produced by two DNA pools of extreme character and, second, by measuring total allele content differences between two pools containing large numbers of DNA samples. These strategies have been used effectively to identify shared chromosomal regions for linkage studies and to investigate candidate disease loci for fine-structure gene mapping using allelic association. This paper outlines the use of DNA pooling as a potential tool for locating complex disease loci, statistical methods for accurate estimation of allelic frequencies from DNA pools, and the advantages, drawbacks, and significance of pooled DNA samples in associated linkage mapping.

8.
We present a class of likelihood-based score statistics that accommodate genotypes of both unrelated individuals and families, thereby combining the advantages of case-control and family-based designs. The likelihood extends the one proposed by Schaid and colleagues (Schaid and Sommer 1993, 1994; Schaid 1996; Schaid and Li 1997) to arbitrary family structures with arbitrary patterns of missing data and to dense sets of multiple markers. The score statistic comprises two component test statistics. The first component statistic, the nonfounder statistic, evaluates disequilibrium in the transmission of marker alleles from parents to offspring. This statistic, when applied to nuclear families, generalizes the transmission/disequilibrium test to arbitrary numbers of affected and unaffected siblings, with or without typed parents. The second component statistic, the founder statistic, compares observed or inferred marker genotypes in the family founders with those of controls or those of some reference population. The founder statistic generalizes the statistics commonly used for case-control data. The strengths of the approach include both the ability to assess, by comparison of nonfounder and founder statistics, the potential bias resulting from population stratification and the ability to accommodate arbitrary family structures, thus eliminating the need for many different ad hoc tests. A limitation of the approach is the potential power loss and/or bias resulting from inappropriate assumptions on the distribution of founder genotypes. The systematic likelihood-based framework provided here should be useful in the evaluation of both the relative merits of case-control and various family-based designs and the relative merits of different tests applied to the same design. It should also be useful for genotype-disease association studies done with the use of a dense set of multiple markers.

9.
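Genetic diagnosis of Huntington disease   Total citations: 2 (self-citations: 0; citations by others: 2)
Mo Yaqin (莫亚勤), Li Luyun (李麓芸), Lu Guangxiu (卢光琇). 《遗传》 (Hereditas (Beijing)), 2005, 27(6): 861-864
To detect the (CAG)n trinucleotide repeat at the 5' end of the open reading frame of the HD gene simply and efficiently, and to establish a rapid and accurate method for the genetic diagnosis of Huntington disease (HD), TaKaRa LA Taq DNA polymerase was used with GC buffer to amplify the target fragment of the HD gene containing the (CAG)n repeat. After detection by non-denaturing polyacrylamide gel electrophoresis, target fragments with an abnormally increased (CAG)n copy number were recovered, re-amplified by PCR, and ligated into a T vector, and DNA sequencing was performed to determine the CAG copy number. The method was applied to three members of an HD family and 20 normal individuals. In each of the three family members, the (CAG)n copy number on one chromosome was within the normal range, whereas the copy number on the other chromosome was abnormally increased, at 39, 40, and 41 repeats, respectively. In all 20 normal individuals the (CAG)n copy numbers were within the normal range, and the copy numbers of normal and HD alleles did not overlap. The method can therefore provide an accurate genetic diagnosis of HD, and the results also show that dynamic mutation of the HD gene is the genetic basis of Huntington disease in Chinese patients.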
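Once a fragment has been sequenced, determining the CAG copy number amounts to counting consecutive CAG units in the read. The following is a minimal Python sketch of that counting step only; the function name and the toy sequence are hypothetical, and it is not the authors' analysis procedure.

```python
import re


def longest_cag_run(sequence):
    """Return the largest number of consecutive CAG units in a DNA sequence."""
    runs = re.findall(r"(?:CAG)+", sequence.upper())
    return max((len(run) // 3 for run in runs), default=0)


# Toy read: a start codon, an uninterrupted tract of 40 CAG units,
# then a CAA interruption followed by a single CAG.
toy_read = "ATG" + "CAG" * 40 + "CAACAG"
repeat_count = longest_cag_run(toy_read)
```

The sketch reports the longest uninterrupted tract and ignores interrupted repeats, sequencing errors, and which chromosome a read came from; whether a given count is abnormal has to be judged against clinical reference ranges, which are not encoded here.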

10.
Genome-wide association studies (GWASs) are commonly used for the mapping of genetic loci that influence complex traits. A problem that is often encountered in both population-based and family-based GWASs is that of identifying cryptic relatedness and population stratification because it is well known that failure to appropriately account for both pedigree and population structure can lead to spurious association. A number of methods have been proposed for identifying relatives in samples from homogeneous populations. A strong assumption of population homogeneity, however, is often untenable, and many GWASs include samples from structured populations. Here, we consider the problem of estimating relatedness in structured populations with admixed ancestry. We propose a method, REAP (relatedness estimation in admixed populations), for robust estimation of identity by descent (IBD)-sharing probabilities and kinship coefficients in admixed populations. REAP appropriately accounts for population structure and ancestry-related assortative mating by using individual-specific allele frequencies at SNPs that are calculated on the basis of ancestry derived from whole-genome analysis. In simulation studies with related individuals and admixture from highly divergent populations, we demonstrate that REAP gives accurate IBD-sharing probabilities and kinship coefficients. We apply REAP to the Mexican Americans in Los Angeles, California (MXL) population sample of release 3 of phase III of the International Haplotype Map Project; in this sample, we identify third- and fourth-degree relatives who have not previously been reported. We also apply REAP to the African American and Hispanic samples from the Women's Health Initiative SNP Health Association Resource (WHI-SHARe) study, in which hundreds of pairs of cryptically related individuals have been identified.

11.
We propose a method, the maximum identity length contrast (MILC) statistic, to locate genetic risk factors for complex diseases in founder populations. The MILC approach compares the identity length of parental haplotypes that are transmitted to affected offspring with the identity length of those that are not transmitted to affected offspring. Initially, the statistical properties of the method were assessed using randomly selected affected individuals with unknown relationship. Because both nuclear families with multiple affected sibs and large pedigrees are often available in founder populations, we performed simulations to investigate the properties of the MILC statistic in the presence of closely related affected individuals. The simulation showed that the use of closely related affected individuals greatly enhances the power of the statistic. For a given sample size and type I error, the use of affected sib pairs, instead of affected individuals randomly selected from the population, could increase the power by a factor of two. This increase was related to an increase of kinship-coefficient contrast between haplotype groups when closely related individuals were considered. The MILC approach allows the simultaneous use of affected individuals from a founder population and affected individuals with any kind of relationship, close or remote. We used the MILC approach to analyze the role of HLA in celiac disease and showed that the effect of HLA may be detected with the MILC approach by typing only 11 affected individuals, who were part of a single large Finnish pedigree.

12.
Case-control studies compare marker-allele distributions in affected and unaffected individuals, and significant results suggest linkage but may simply reflect population structure. For markers with $m$ alleles ($m \geq 2$), a McNemar-like statistic, $I$, estimates the level of population association between marker and disease loci. To test for linkage after significant case-control tests, within-family tests are performed. These operate on the contingency table, with $(i, j)$th element equal to the number of parents that transmit marker allele $M_i$ and do not transmit marker allele $M_j$ to an affected offspring. The dimension of the table is the number of alleles at the marker locus. Three test statistics have recently been proposed in the literature: $T_c$ compares symmetric pairs of cells $(i, j)$ and $(j, i)$, $T_m$ compares row and column totals for the same marker allele, and a likelihood ratio statistic $T_l$ uses all the cells in the table. In addition, we consider a new statistic, $T_{mhet}$, that uses only the heterozygous parents and is approximately $\chi^2$ with $(m - 1)$ df. We use a Monte Carlo test to guarantee valid tests and to demonstrate the inferiority of $T_c$ and the equality of $T_m$ and $T_l$ in terms of power. The power of the $T_{mhet}$ test is close but not always equal to the power of the $T_m$ test. We also show that under the alternative hypothesis of linkage, $T_m$ is approximately noncentral $\chi^2$ with $(m - 1)$ df and noncentrality parameter $2N_T(1 - 2\theta)^2 I^*$, when data on single affecteds in $N_T$ families are used. If the disease has a low population frequency, then $I^*$ is estimated using the case-control statistic $I$. This offers a basis for choosing sample size, or choosing a marker system.
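The noncentrality parameter quoted in this abstract is simple to evaluate, which is what makes it usable for sample-size planning. A minimal Python sketch follows; the function name, argument names, and example values are hypothetical, and turning the result into power would additionally require a noncentral chi-square distribution function, which is not included here.

```python
def tm_noncentrality(n_families, theta, association):
    """Evaluate 2 * N_T * (1 - 2*theta)**2 * I* as given in the abstract.

    n_families:  number of families with a single affected offspring (N_T)
    theta:       recombination fraction between marker and disease locus
    association: the population association measure I*
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")
    return 2.0 * n_families * (1.0 - 2.0 * theta) ** 2 * association


# Hypothetical planning values: 200 families, I* = 0.05.
tight_linkage = tm_noncentrality(200, 0.0, 0.05)
loose_linkage = tm_noncentrality(200, 0.25, 0.05)
no_linkage = tm_noncentrality(200, 0.5, 0.05)
```

The factor (1 - 2θ)² shrinks the parameter to zero as θ approaches 0.5, so for a fixed number of families the test loses power quickly as the marker moves away from the disease locus.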

13.
The transmission/disequilibrium test (TDT) is a popular, simple, and powerful test of linkage, which can be used to analyze data consisting of transmissions to the affected members of families with any kind of pedigree structure, including affected sib pairs (ASPs). Although it is based on the preferential transmission of a particular marker allele across families, it is not a valid test of association for ASPs. Martin et al. devised a similar statistic for ASPs, Tsp, which is also based on preferential transmission of a marker allele but which is a valid test of both linkage and association for ASPs. It is, however, less powerful than the TDT as a test of linkage for ASPs. What I show is that the differences between the TDT and Tsp are due to the fact that, although both statistics are based on preferential transmission of a marker allele, the TDT also exploits excess sharing in identity-by-descent transmissions to ASPs. Furthermore, I show that both of these statistics are members of a family of "TDT-like" statistics for ASPs. The statistics in this family are based on preferential transmission but also, to varying extents, exploit excess sharing. From this family of statistics, we see that, although the TDT exploits excess sharing to some extent, it is possible to do so to a greater extent, and thus to produce a more powerful test of linkage for ASPs than is provided by the TDT. Power simulations conducted under a number of disease models are used to verify that the most powerful member of this family of TDT-like statistics is more powerful than the TDT for ASPs.
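For readers unfamiliar with the baseline, the standard diallelic TDT is a McNemar-type statistic on transmissions from heterozygous parents. The Python sketch below shows only that classical form; it is not the Tsp statistic or the TDT-like family discussed in the abstract, and the function name and counts are hypothetical.

```python
def tdt_statistic(transmitted, not_transmitted):
    """Classical TDT chi-square (1 df) for a diallelic marker.

    transmitted:     heterozygous parents who transmit the allele of interest
    not_transmitted: heterozygous parents who transmit the other allele
    """
    total = transmitted + not_transmitted
    if total == 0:
        raise ValueError("no informative (heterozygous) parents")
    return (transmitted - not_transmitted) ** 2 / total


# Hypothetical counts: 60 transmissions versus 40 non-transmissions.
example_tdt = tdt_statistic(60, 40)
```

The statistic is referred to a chi-square distribution with 1 df. Because each affected sib in an ASP contributes transmissions from the same parents, those transmissions are not independent under linkage, which is the issue the abstract addresses.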

14.
Relationships play a very important role in studies on quantitative genetics. In traditional breeding, pedigree records are used to establish relationships between animals; while this kind of relationship actually represents one kind of relatedness, it cannot distinguish individual specificity, capture the variation between individuals or determine the actual genetic superiority of an animal. However, with the popularization of high-throughput genotypes, assessments of relationships among animals based on genomic information could be a better option. In this study, we compared the relationships between animals based on pedigree and genomic information from two pig breeding herds with different genetic backgrounds and a simulated dataset. Two different methods were implemented to calculate genomic relationship coefficients and genomic kinship coefficients, respectively. Our results show that, for the same kind of relative, the average genomic relationship coefficients (G matrix) were very close to the pedigree relationship coefficients (A matrix), and on average, the corresponding values were halved in genomic kinship coefficients (K matrix). However, the genomic relationship yielded a larger variation than the pedigree relationship, and the latter was similar to that expected for one relative with no or little variation. Two genomic relationship coefficients were highly correlated, for farm1, farm2 and simulated data, and the correlations for the parent-offspring, full-sib and half-sib were 0.95, 0.90 and 0.85; 0.93, 0.96 and 0.89; and 0.52, 0.85 and 0.77, respectively. When the inbreeding coefficient was measured, the genomic information also yielded a higher inbreeding coefficient and a larger variation than that yielded by the pedigree information. For the two genetically divergent Large White populations, the pedigree relationship coefficients between the individuals were 0, and 62 310 and 175 271 animal pairs in the G matrix and K matrix were greater than 0. 
Our results demonstrated that genomic information outperformed the pedigree information; it can more accurately reflect the relationships and capture the variation that is not detected by pedigree. This information is very helpful in the estimation of genomic breeding values or gene mapping. In addition, genomic information is useful for pedigree correction. Further, our findings also indicate that genomic information can establish the genetic connection between different groups with different genetic background. In addition, it can be used to provide a more accurate measurement of the inbreeding of an animal, which is very important for the assessment of a population structure and breeding plan. However, the approaches for measuring genomic relationships need further investigation.
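The pedigree side of this comparison can be sketched with the standard recursive definition of the kinship coefficient, from which the pedigree relationship coefficient is obtained by doubling. The Python sketch below is illustrative only: the function name and the toy pedigree are hypothetical, and it does not reproduce the genomic G or K matrices, whose estimators the abstract does not specify.

```python
from functools import lru_cache


def kinship_function(pedigree):
    """Build a kinship function from {individual: (father, mother)}.

    Founders map to (None, None). Returns phi(i, j), the probability that
    alleles drawn at random from i and j are identical by descent.
    """
    depth = {}

    def generation(ind):
        if ind is None:
            return -1
        if ind not in depth:
            father, mother = pedigree[ind]
            depth[ind] = 1 + max(generation(father), generation(mother))
        return depth[ind]

    for ind in pedigree:
        generation(ind)

    @lru_cache(maxsize=None)
    def phi(i, j):
        if i is None or j is None:
            return 0.0
        if i == j:
            father, mother = pedigree[i]
            return 0.5 * (1.0 + phi(father, mother))
        if depth[i] < depth[j]:
            i, j = j, i  # always expand the individual from the later generation
        father, mother = pedigree[i]
        return 0.5 * (phi(father, j) + phi(mother, j))

    return phi


# Toy pedigree: one sire mated to two unrelated dams.
toy_pedigree = {
    "sire": (None, None), "dam1": (None, None), "dam2": (None, None),
    "kid1": ("sire", "dam1"), "kid2": ("sire", "dam1"), "kid3": ("sire", "dam2"),
}
phi = kinship_function(toy_pedigree)
relationship = {
    "parent-offspring": 2 * phi("kid1", "sire"),
    "full-sib": 2 * phi("kid1", "kid2"),
    "half-sib": 2 * phi("kid1", "kid3"),
}
```

Each class of relative gets a single fixed pedigree value (0.5, 0.5, and 0.25 here), with kinship equal to half of it. That is consistent with the abstract's observations that the K matrix values are about half the relationship values and that pedigree coefficients show little or no variation within a class of relatives.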

15.
A consanguineous family affected by an autosomal recessive, progressive neurodegenerative Huntington-like disorder, was tested to rule out juvenile-onset Huntington disease (JHD). The disease manifests at approximately 3-4 years and is characterized by both pyramidal and extrapyramidal abnormalities, including chorea, dystonia, ataxia, gait instability, spasticity, seizures, mutism, and intellectual impairment. Brain magnetic resonance imaging (MRI) findings include progressive frontal cortical atrophy and bilateral caudate atrophy. Huntington CAG trinucleotide-repeat analyses ruled out JHD, since all affected individuals had repeat numbers within the normal range. The presence of only four recombinant events ($\theta = .2$) between the disease and the Huntington locus in 20 informative meioses suggested that the disease localized to chromosome 4. Linkage was initially achieved with marker D4S2366 at 4p15.3 (LOD 3.03). High-density mapping at the linked locus resulted in homozygosity for markers D4S431 and D4S394, which span a 3-cM region. A maximum LOD score of 4.71 in the homozygous interval was obtained. Heterozygosity at the distal D4S2366 and proximal D4S2983 markers defines the maximum localization interval (7 cM). Multiple brain-related expressed sequence tags (ESTs) with no known disease association exist in the linkage interval. Among the three known genes residing in the linked interval (ACOX3, DRD5, QDPR), the most likely candidate, DRD5, encoding the dopamine receptor D5, was excluded, since all five affected family members were heterozygous for an intragenic dinucleotide repeat. The inheritance pattern and unique localization to 4p15.3 are consistent with the identification of a novel, autosomal recessive, neurodegenerative Huntington-like disorder.
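The recombination estimate in this abstract (4 recombinants in 20 informative meioses, so θ = 4/20 = .2) can be related to a LOD score through the textbook two-point formula for phase-known meioses. The Python sketch below uses that simplified formula as an assumption; the LOD scores reported in the abstract come from the authors' own analysis of a consanguineous pedigree and are not reproduced by it.

```python
from math import log10


def two_point_lod(recombinants, meioses, theta):
    """LOD score for phase-known meioses: log10 of L(theta) / L(0.5)."""
    if not 0.0 < theta <= 0.5:
        raise ValueError("theta must lie in (0, 0.5]")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must lie between 0 and meioses")
    nonrecombinants = meioses - recombinants
    return (recombinants * log10(theta)
            + nonrecombinants * log10(1.0 - theta)
            - meioses * log10(0.5))


# Counts taken from the abstract; the maximum-likelihood theta is 4/20.
theta_hat = 4 / 20
lod_at_theta_hat = two_point_lod(4, 20, theta_hat)  # roughly 1.67
```

Under these simplifying assumptions the counts alone give a LOD of about 1.67 at θ = .2, which is suggestive rather than conclusive and fits the abstract's wording that the result "suggested" localization to chromosome 4 before denser mapping was done.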

16.
Abney M. Genetics, 2008, 179(3): 1577-1590
Computing identity-by-descent sharing between individuals connected through a large, complex pedigree is a computationally demanding task that often cannot be done using exact methods. What I present here is a rapid computational method for estimating, in large complex pedigrees, the probability that pairs of alleles are IBD given the single-point genotype data at that marker for all individuals. The method can be used on pedigrees of essentially arbitrary size and complexity without the need to divide the individuals into separate subpedigrees. I apply the method to do qualitative trait linkage mapping using the nonparametric sharing statistic S(pairs). The validity of the method is demonstrated via simulation studies on a 13-generation 3028-person pedigree with 700 genotyped individuals. An analysis of an asthma data set of individuals in this pedigree finds four loci with P-values $< 10^{-3}$ that were not detected in prior analyses. The mapping method is fast and can complete analyses of approximately 150 affected individuals within this pedigree for thousands of markers in a matter of hours.

17.
A general approach to family-based examinations of association between marker alleles and traits is proposed. The approach is based on computing p values by comparing test statistics for association to their conditional distributions given the minimal sufficient statistic under the null hypothesis for the genetic model, sampling plan and population admixture. The approach can be applied with any test statistic, so any kind of phenotype and multi-allelic markers may be examined, and covariates may be included in analyses. By virtue of the conditioning, the approach results in correct type I error probabilities regardless of population admixture, the true genetic model and the sampling strategy. An algorithm for computing the conditional distributions is described, and the results of the algorithm for configurations of nuclear families are presented. The algorithm is applicable with all pedigree structures and all patterns of missing marker allele information.

18.
In many disease genes, a substantial fraction of all rare variants detected cannot yet be used for genetic counselling because of uncertainty about their association with disease. One approach to the characterization of these unclassified variants is the analysis of patterns of cosegregation with disease in affected carrier families. Petersen et al. previously provided a simplistic Bayesian method for evaluation of causality of such sequence variants. In the present report, we propose a more general method based on the full pedigree likelihood, and we show that the use of this method can provide more accurate and informative assessment of causality than could the previous method. We further show that it is important that the pedigree information be as complete as possible and that the distinction be made between unaffected individuals and those of unknown phenotype.

19.
Two likelihood-based score statistics are used to detect association between a disease and a single diallelic polymorphism, on the basis of data from arbitrary types of nuclear families. The first statistic, the nonfounder statistic, extends the transmission/disequilibrium test to accommodate affected and unaffected offspring and missing parental genotypes. The second statistic, the founder statistic, compares observed or inferred parental genotypes with those of some reference population. In this comparison, the genotypes of affected parents or of those with many affected offspring are weighted more heavily than are the genotypes of unaffected parents or of those with few affected offspring. Genotypes of single unrelated cases and controls can be included in this analysis. We illustrate the two statistics by applying them to data on a polymorphism of the SRD5A2 gene in nuclear families with multiple cases of prostate cancer. We also use simulations to compare the power of the nonfounder statistic with that of the score statistic, on the basis of the conditional logistic regression of offspring genotypes.

20.
One approach frequently used for identifying genetic factors involved in the process of a complex disease is the comparison of patients and controls for a number of genetic markers near a candidate gene. The analysis of such association studies raises some specific problems because of the fact that genotypic and not gametic data are generally available. We present a log-linear-model analysis providing a valid method for analyzing such studies. When studying the association of disease with one marker locus, the log-linear model allows one to test for the difference between allelic frequencies among affected and unaffected individuals, Hardy-Weinberg (H-W) equilibrium in both groups, and interaction between the association of alleles at the marker locus and disease. This interaction provides information about the dominance of the disease susceptibility locus, with dominance defined using the epidemiological notion of odds ratio. The degree of dominance measured at the marker locus depends on the strength of linkage disequilibrium between the marker locus and the disease locus. When studying the association of disease with several linked markers, the model becomes rapidly complex and uninterpretable unless it is assumed that affected and unaffected populations are in H-W equilibrium at each locus. This hypothesis must be tested before going ahead in the analysis. If it is not rejected, the log-linear model offers a stepwise method of identification of the parameters causing the difference between populations. This model can be extended to any number of loci, alleles, or populations.
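The Hardy-Weinberg check that this abstract requires before the multi-marker analysis can be illustrated, for a single diallelic marker in one group, with the ordinary goodness-of-fit chi-square. The Python sketch below is a simpler stand-in for the log-linear formulation the authors describe, not their method; the function name and genotype counts are hypothetical.

```python
def hwe_chi_square(n_aa, n_ab, n_bb):
    """Goodness-of-fit chi-square (1 df) for Hardy-Weinberg proportions.

    n_aa, n_ab, n_bb: observed counts of the three genotypes at a
    diallelic marker in one group (for example, cases or controls).
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # estimated frequency of allele a
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        raise ValueError("marker is monomorphic in this sample")
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical genotype counts.
in_equilibrium = hwe_chi_square(25, 50, 25)   # matches H-W proportions
heterozygote_deficit = hwe_chi_square(40, 20, 40)
```

The statistic would be computed separately in the affected and unaffected groups, since the abstract asks for equilibrium in both. Small expected counts call for an exact test instead, which this sketch does not provide.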
