Similar articles
20 similar articles found
1.
The problem of ascertainment for linkage analysis.   Total citations: 2, self-citations: 0, citations by others: 2
It is generally believed that ascertainment corrections are unnecessary in linkage analysis, provided individuals are selected for study solely on the basis of trait phenotype and not on the basis of marker genotype. The theoretical rationale for this is that standard linkage analytic methods involve conditioning likelihoods on all the trait data, which may be viewed as an application of the ascertainment assumption-free (AAF) method of Ewens and Shute. In this paper, we show that when the observed pedigree structure depends on which relatives within a pedigree happen to have been the probands (proband-dependent, or PD, sampling) conditioning on all the trait data is not a valid application of the AAF method and will result in asymptotically biased estimates of genetic parameters (except under single ascertainment). Furthermore, this result holds even if the recombination fraction R is the only parameter of interest. Since the lod score is proportional to the likelihood of the marker data conditional on all the trait data, this means that when data are obtained under PD sampling the lod score will yield asymptotically biased estimates of R, and that so-called mod scores (i.e., lod scores maximized over both R and parameters theta of the trait distribution) will yield asymptotically biased estimates of R and theta. Furthermore, the problem appears to be intractable, in the sense that it is not possible to formulate the correct likelihood conditional on observed pedigree structure. In this paper we do not investigate the numerical magnitude of the bias, which may be small in many situations. On the other hand, virtually all linkage data sets are collected under PD sampling. Thus, the existence of this bias will be the rule rather than the exception in the usual applications.

2.
3.
Power to detect risk alleles using genome-wide tag SNP panels   Total citations: 1, self-citations: 0, citations by others: 1
Advances in high-throughput genotyping and the International HapMap Project have enabled association studies at the whole-genome level. We have constructed whole-genome genotyping panels of over 550,000 (HumanHap550) and 650,000 (HumanHap650Y) SNP loci by choosing tag SNPs from all populations genotyped by the International HapMap Project. These panels also contain additional SNP content in regions that have historically been overrepresented in diseases, such as nonsynonymous sites, the MHC region, copy number variant regions and mitochondrial DNA. We estimate that the tag SNP loci in these panels cover the majority of all common variation in the genome as measured by coverage of both all common HapMap SNPs and an independent set of SNPs derived from complete resequencing of genes obtained from SeattleSNPs. We also estimate that, given a sample size of 1,000 cases and 1,000 controls, these panels have the power to detect single disease loci of moderate risk (λ ~ 1.8–2.0). Relative risks as low as λ ~ 1.1–1.3 can be detected using 10,000 cases and 10,000 controls depending on the sample population and disease model. If multiple loci are involved, the power increases significantly to detect at least one locus such that relative risks 20%–35% lower can be detected with 80% power if between two and four independent loci are involved. Although our SNP selection was based on HapMap data, which is a subset of all common SNPs, these panels effectively capture the majority of all common variation and provide high power to detect risk alleles that are not represented in the HapMap data.

4.
5.
Genomewide association studies have been advocated as a promising alternative to genomewide linkage scans for detection of small-effect genes in complex diseases. Comparisons of power and sample size between the two strategies have shown considerable advantages for the association studies. These comparisons assume that the set of markers includes the exact disease-related polymorphism. A concern, however, is that the power of an association study decreases when this is not the case, because of discrepant allele frequencies and less-than-maximum disequilibrium between the disease-related polymorphism and its nearest marker. Here, we quantify this concern by comparing the sample sizes needed by the two strategies when the markers exclude the disease-related polymorphism. For affected sib pairs and their parents, we found that incomplete disequilibrium and differing allele frequencies can have substantial negative impact on the power of association studies, resulting, in some circumstances, in little gain and even in loss of power, compared with linkage analysis. We provide some guidelines for choosing between strategies, for the detection of genes for complex diseases.

6.

Background

We investigate the power of the heterogeneity LOD test to detect linkage when a trait is determined by several major genes, using Genetic Analysis Workshop 13 simulated data. We consider three traits: two disease-causing traits, 1) the rate of change in body mass index (BMI) and 2) the maximum BMI, and 3) the disease itself (hypertension). Of interest is the power of "HLOD2", the maximum heterogeneity LOD obtained upon maximizing over the two genetic models.

Results

Using the trait phenotype Obesity Slope, we observe that the power to detect the two markers closest to the two genes (S1, S2) at the 0.05 level using HLOD2 is 13% and 10%, respectively. The power of HLOD2 for the Max BMI phenotype is 12% and 9%. The corresponding values for the Hypertension phenotype are 8% and 6%.

Conclusion

The power to detect linkage to the slope genes is quite low, but the power using disease-related traits as the phenotype is greater than the power using the disease (hypertension) phenotype itself.

7.
8.
As large-scale sequencing efforts turn from single genome sequencing to polymorphism discovery, single nucleotide polymorphisms (SNPs) are becoming an increasingly important class of population genetic data. But because of the ascertainment biases introduced by many methods of SNP discovery, most SNP data cannot be analyzed using classical population genetic methods. Statistical methods must instead be developed that can explicitly take into account each method of SNP discovery. Here we review some of the current methods for analyzing SNPs and derive sampling distributions for single SNPs and pairs of SNPs for some common SNP discovery schemes. We also show that the ascertainment scheme has a large effect on the estimation of linkage disequilibrium and recombination, and describe some methods of correcting for ascertainment biases when estimating recombination rates from SNP data.

9.
Zaykin DV, Pudovkin A, Weir BS. Genetics 2008, 180(1):533-545
The correlation between alleles at a pair of genetic loci is a measure of linkage disequilibrium. The square of the sample correlation multiplied by sample size provides the usual test statistic for the hypothesis of no disequilibrium for loci with two alleles and this relation has proved useful for study design and marker selection. Nevertheless, this relation holds only in a diallelic case, and an extension to multiple alleles has not been made. Here we introduce a similar statistic, $R^2$, which leads to a correlation-based test for loci with multiple alleles: for a pair of loci with $k$ and $m$ alleles, and a sample of $n$ individuals, the approximate distribution of $n\,\frac{(k-1)(m-1)}{km}\,R^2$ under independence between loci is $\chi^2_{(k-1)(m-1)}$. One advantage of this statistic is that it can be interpreted as the total correlation between a pair of loci. When the phase of two-locus genotypes is known, the approach is equivalent to a test for the overall correlation between rows and columns in a contingency table. In the phase-known case, $R^2$ is the sum of the squared sample correlations for all $km$ $2 \times 2$ subtables formed by collapsing to one allele vs. the rest at each locus. We examine the approximate distribution under the null of independence for $R^2$ and report its close agreement with the exact distribution obtained by permutation. The test for independence using $R^2$ is a strong competitor to approaches such as Pearson's chi square, Fisher's exact test, and a test based on Cressie and Read's power divergence statistic. We combine this approach with our previous composite-disequilibrium measures to address the case when the genotypic phase is unknown. Calculation of the new multiallele test statistic and its P-value is very simple and utilizes the approximate distribution of $R^2$. We provide a computer program that evaluates approximate as well as "exact" permutational P-values.
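The phase-known statistic described in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' program: the function name `multiallelic_r2_test` is hypothetical, the input is assumed to be a k-by-m table of haplotype counts in which every allele is observed at least once, and n is taken to be the total count in the table.

```python
def multiallelic_r2_test(table):
    """Sketch of the multiallelic correlation statistic (phase-known case).

    table[i][j] is the count of haplotypes carrying allele i at the first
    locus and allele j at the second locus (hypothetical input layout).
    Returns (R2, scaled statistic, degrees of freedom).
    """
    k = len(table)       # number of alleles at locus 1
    m = len(table[0])    # number of alleles at locus 2
    n = sum(sum(row) for row in table)

    # Marginal allele frequencies at each locus.
    row_freq = [sum(row) / n for row in table]
    col_freq = [sum(table[i][j] for i in range(k)) / n for j in range(m)]

    # Sum the squared correlation of every 2 x 2 subtable obtained by
    # collapsing to "allele i vs. the rest" and "allele j vs. the rest".
    r2_total = 0.0
    for i in range(k):
        for j in range(m):
            d_ij = table[i][j] / n - row_freq[i] * col_freq[j]
            denom = (row_freq[i] * (1 - row_freq[i])
                     * col_freq[j] * (1 - col_freq[j]))
            r2_total += d_ij ** 2 / denom

    # Scale so that the statistic is approximately chi-square distributed
    # with (k - 1)(m - 1) degrees of freedom under independence.
    statistic = n * (k - 1) * (m - 1) / (k * m) * r2_total
    df = (k - 1) * (m - 1)
    return r2_total, statistic, df


r2, stat, df = multiallelic_r2_test([[30, 10], [10, 30]])
print(r2, stat, df)
```

With two alleles at each locus, all four collapsed subtables share the same squared correlation, so the scaled statistic reduces to the usual sample size times squared correlation, which is a quick sanity check on the scaling factor.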

10.
Genome-wide association studies have been instrumental in identifying genetic variants associated with complex traits such as human disease or gene expression phenotypes. It has been proposed that extending existing analysis methods by considering interactions between pairs of loci may uncover additional genetic effects. However, the large number of possible two-marker tests presents significant computational and statistical challenges. Although several strategies to detect epistasis effects have been proposed and tested for specific phenotypes, so far there has been no systematic attempt to compare their performance using real data. We made use of thousands of gene expression traits from linkage and eQTL studies, to compare the performance of different strategies. We found that using information from marginal associations between markers and phenotypes to detect epistatic effects yielded a lower false discovery rate (FDR) than a strategy solely using biological annotation in yeast, whereas results from human data were inconclusive. For future studies whose aim is to discover epistatic effects, we recommend incorporating information about marginal associations between SNPs and phenotypes instead of relying solely on biological annotation. Improved methods to discover epistatic effects will result in a more complete understanding of complex genetic effects.

11.
Under the assumption of two family types, one with linkage and one without linkage, the number of phase-known double-backcross families required to detect heterogeneity is investigated. The case of testing for heterogeneity with two offspring per family is shown to be formally equivalent to testing for inbreeding effects in a sample of unrelated individuals.

12.
Results from power studies for linkage detection have led to many ongoing and planned collections of phenotypically extreme nuclear families. Given the great expense of collecting these families and the imminent availability of a dense diallelic marker map, the families are likely to be used in allelic-association as well as linkage studies. However, optimal selection strategies for linkage may not be equally powerful for association. We examine the power to detect linkage disequilibrium for quantitative traits after phenotypic selection. The results encompass six selection strategies that are in widespread use, including single selection (two designs), affected sib pairs, concordant and discordant pairs, and the extreme-concordant and -discordant design. Selection of sibships on the basis of one extreme proband with high or low trait scores provides as much power as discordant sib pairs but requires the screening and phenotyping of substantially fewer initial families from which to select. Analysis of the role of allele frequencies within each selection design indicates that common trait alleles generally offer the most power, but similarities between the marker- and trait-allele frequencies are much more important than the trait-locus frequency alone. Some of the most widespread selection designs, such as single selection, yield power gains only when both the marker and quantitative trait loci (QTL) are relatively rare in the population. In contrast, discordant pairs and the extreme-proband design provide power for the broadest range of QTL-marker-allele frequency differences. Overall, proband selection from either tail provides the best balance of power, robustness, and simplicity of ascertainment for family-based association analysis.

13.
Background

Ascertaining incident cancers is a critical component of cancer-focused epidemiologic cohorts and of cancer prevention trials. Potential methods for cancer case ascertainment include active follow-up and passive linkage with state cancer registries. Here we compare the two approaches in a large cancer screening trial.

Methods

The Prostate, Lung, Colorectal and Ovarian (PLCO) cancer screening trial enrolled 154,955 subjects at ten U.S. centers and followed them for all-cancer incidence. Cancers were ascertained by an active follow-up process involving annual questionnaires, retrieval of records and medical record abstracting to ascertain and confirm cancers. For a subset of centers, linkage with state cancer registries was also performed. We assessed the agreement of the two methods in ascertaining incident cancers from 1993 to 2009 in 80,083 subjects from six PLCO centers where cancers were ascertained both by active follow-up and through linkages with 14 state registries.

Results

The ratio (times 100) of confirmed cases ascertained by registry linkage compared to active follow-up was 96.4 (95% CI: 95.1–98.2). Of cancers ascertained by either method, 86.6% and 83.5% were identified by active follow-up and by registry linkage, respectively. Of cancers missed by active follow-up, 30% were after subjects were lost to follow-up and 16% were reported but could not be confirmed. Of cancers missed by the registries, 27% were not sent to the state registry of the subject’s current address at the time of linkage.

Conclusion

Linkage with state registries identified a similar number of cancers as active follow-up and can be a cost-effective method to ascertain incident cancers in a large cohort.
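The percentages reported in this abstract can be cross-checked with a little arithmetic. The sketch below is not part of the study; the function name `compare_ascertainment` is hypothetical. It assumes only that both percentages share the same denominator (cancers found by either method), so that inclusion-exclusion gives the overlap and the ratio of the two percentages equals the ratio of case counts.

```python
def compare_ascertainment(pct_active, pct_registry):
    """Derive overlap figures from two capture percentages.

    Both arguments are percentages of the cancers found by either method
    (the union), so the share found by both follows by inclusion-exclusion.
    """
    pct_both = pct_active + pct_registry - 100.0
    return {
        # Registry cases per 100 active follow-up cases.
        "ratio_x100": 100.0 * pct_registry / pct_active,
        "both": pct_both,
        "active_only": pct_active - pct_both,
        "registry_only": pct_registry - pct_both,
    }


summary = compare_ascertainment(86.6, 83.5)
for name, value in summary.items():
    print(f"{name}: {value:.1f}")
```

Under that assumption the ratio comes out at about 96.4, which appears consistent with the figure the abstract reports, and roughly 70% of the cancers would have been found by both methods.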

14.

Background

Millions of people are infected with Trypanosoma cruzi, the causative agent of Chagas disease in Latin America. Anti-trypanosomal drug therapy can cure infected individuals, but treatment efficacy is highest early in infection. Vector control campaigns disrupt transmission of T. cruzi, but without timely diagnosis, children infected prior to vector control often miss the window of opportunity for effective chemotherapy.

Methods and Findings

We performed a serological survey in children 2–18 years old living in a peri-urban community of Arequipa, Peru, and linked the results to entomologic, spatial and census data gathered during a vector control campaign. 23 of 433 (5.3% [95% CI 3.4–7.9]) children were confirmed seropositive for T. cruzi infection by two methods. Spatial analysis revealed that households with infected children were very tightly clustered within looser clusters of households with parasite-infected vectors. Bayesian hierarchical mixed models, which controlled for clustering of infection, showed that a child's risk of being seropositive increased by 20% per year of age and 4% per vector captured within the child's house. Receiver operator characteristic (ROC) plots of best-fit models suggest that more than 83% of infected children could be identified while testing only 22% of eligible children.

Conclusions

We found evidence of spatially-focal vector-borne T. cruzi transmission in peri-urban Arequipa. Ongoing vector control campaigns, in addition to preventing further parasite transmission, facilitate the collection of data essential to identifying children at high risk of T. cruzi infection. Targeted screening strategies could make integration of diagnosis and treatment of children into Chagas disease control programs feasible in lower-resource settings.

15.
16.
A test to detect clusters of disease   Total citations: 2, self-citations: 0, citations by others: 2

17.
Stargardt disease is a recessively transmitted disease caused by mutations in the ABCR gene. Linkage disequilibrium has recently been reported between a polymorphism, 2828 A, and a common Western European founder mutation, 2588 C. Here, we confirm this linkage disequilibrium in a North American population. We also describe two complex alleles involving the 2828 A and 2588 C alterations and suggest a possible order of clinical severity of mutations identified in trans to the complex alleles. Finally, we report pseudodominance of Stargardt disease in a family with the 2588 C mutation, further supporting a high frequency of carriers for ABCR mutations in our population.

18.
Estimates of the degree of nonrandom association among genes (linkage disequilibrium) can provide evidence of the role of natural selection in maintaining allozyme polymorphisms in natural populations. This paper outlines the maximum likelihood procedures for such estimates based on gametic or zygotic frequencies at the level of two loci. The analysis is extended to estimating disequilibrium between three loci. In particular, the question of the sampling requirements to detect different intensities of disequilibrium is considered. It is found that relatively large samples are required to detect nonrandom association, unless gene frequencies are intermediate and disequilibrium is relatively intense. This might be one reason why cases of linkage disequilibrium have so far proved to be the exception, rather than the rule, in population studies.
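The point about sample size can be made concrete for the simplest case: two loci with two alleles each and directly observed gametic counts. The sketch below is an illustration, not the paper's procedure. The function names are hypothetical, the estimate is the standard multinomial maximum likelihood estimate D = p(AB) - p(A)p(B), and the sample-size rule is a rough heuristic that asks only when the expected value of n r² reaches the 5% chi-square critical value with one degree of freedom (about 3.841), which corresponds to roughly even odds of detection rather than high power.

```python
import math


def gametic_ld(n_AB, n_Ab, n_aB, n_ab):
    """Estimate disequilibrium from counts of the four gamete types.

    Returns (D, r2, n * r2), where n * r2 is the usual one-degree-of-freedom
    chi-square statistic for the hypothesis of no disequilibrium.
    """
    n = n_AB + n_Ab + n_aB + n_ab
    p_A = (n_AB + n_Ab) / n          # frequency of allele A at locus 1
    p_B = (n_AB + n_aB) / n          # frequency of allele B at locus 2
    d_hat = n_AB / n - p_A * p_B     # multinomial ML estimate of D
    r2 = d_hat ** 2 / (p_A * (1 - p_A) * p_B * (1 - p_B))
    return d_hat, r2, n * r2


def rough_sample_size(r2, critical_value=3.841):
    """Heuristic: smallest n for which n * r2 reaches the critical value."""
    return math.ceil(critical_value / r2)


d_hat, r2, stat = gametic_ld(30, 10, 10, 30)
print(d_hat, r2, stat)
print(rough_sample_size(r2), rough_sample_size(0.01))
```

Because r² divides D² by the product of the allele-frequency variances, weak disequilibrium drives r² toward zero and the heuristic sample size up sharply, which is in line with the abstract's conclusion that large samples are usually needed.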

19.
Three new allelic forms of the HLA-G DNA sequence (HLA-G*II, HLA-G*III, and HLA-G*IV) have been identified. With the HLA-G*I sequence (previously designated HLA 6.0) as a reference, HLA-G*II shows a silent (G→A) mutation at the third base of codon 57, HLA-G*III bears a non-synonymous (A→T), but conservative (Thr→Ser), substitution at the first base of codon 31, and HLA-G*IV shows two silent substitutions: (A→T) at the third base of codon 107 and (G→A) at the third base of codon 57. A rapid method of singling out each allele on genomic DNA has been developed by using polymerase chain reaction amplification followed by restriction endonuclease treatment. Also, more or less strong linkage disequilibria have been found between most HLA-A alleles and either HLA-G*I or *II, both being the most prevalent alleles in the population, with a genotypic frequency of 0.55 and 0.38, respectively; HLA-G*III is very rare and HLA-G*IV has a genotypic frequency of 0.07. An evolutionary classification of HLA-A alleles results according to their association with either HLA-G*I or HLA-G*II, which does not correlate with the classical serological cross-reacting groups classification. The finding of strong and selective A/G linkage disequilibria with most HLA-A alleles, together with the existence of less frequent random A/G associations, may suggest that there exist in different haplotypes true and varied A/G genetic distances (and not a recombinational hotspot). It may be inferred from preliminary data that in primates HLA-A/G haplotypes bearing G*II may have appeared later than those bearing G*I.
The nucleotide sequence data reported in this paper have been submitted to the GenBank and EMBL nucleotide sequence databases and have been assigned the following accession numbers: EMBL-X60983 (HLA-G*II), GenBank-M99048 (HLA-G*III), and GenBank-L07784 (HLA-G*IV).
The contribution to this paper by P. Morales and A. Corell is equal, and the order of authorship is arbitrary. Correspondence to: A. Arnaiz-Villena.

20.
Linkage analysis methods that incorporate etiological heterogeneity of complex diseases are likely to demonstrate greater power than traditional linkage analysis methods. Several such methods use covariates to discriminate between linked and unlinked pedigrees with respect to a certain disease locus. Here we apply several such methods including two mixture models, ordered subset analysis, and a conditional logistic model to genome scan data on the DSM-IV alcohol dependence phenotype on the Collaborative Studies on Genetics of Alcoholism families, and compare the results to traditional nonparametric linkage analysis. In general, there was little agreement among the various covariate-based linkage statistics. Linkage signals with empirical p-values less than 0.001 were detected on chromosomes 3, 4, 7, 10, and 12, with the highest peak occurring at the GABRB1 gene using the ecb21 covariate.
