Similar literature
 20 similar articles found; search time 15 ms
1.
Linkage disequilibrium arising from the recent admixture of genetically distinct populations can be potentially useful in mapping genes for complex diseases. McKeigue has proposed a method that conditions on parental admixture to detect linkage. We show that this method tests for linkage only under specific assumptions, such as equal admixture in the parental generation and admixture that occurs in a single generation. In practice, these assumptions are unlikely to hold for natural populations, resulting in an inflation of the type I error rate when testing for linkage by this method. In this article, we generalize McKeigue's approach of testing for linkage to allow two different admixture models: (1) intermixture admixture and (2) continuous gene flow. We calculate the sample size required for a genomewide search by this method under different disease models: multiplicative, additive, recessive, and dominant. Our results show that the sample size required to obtain 90% power to detect a putative mutant allele at a genomewide significance level of 5% can usually be achieved in practice if informative markers are available at a density of 2 cM.

2.
Linkage disequilibrium (LD) testing has become a popular and effective method of fine-scale disease-gene localization. It has been proposed that LD testing could also be used for genome screening, particularly as dense maps of diallelic markers become available and automation allows inexpensive genotyping of diallelic markers. We compare diallelic markers and multiallelic markers in terms of sample sizes required for detection of LD, by use of a single marker locus in a case-control study, for rare monophyletic diseases with Mendelian inheritance. We extrapolate from our results to discuss the feasibility of single-marker LD screening in more-complex situations. We have used a deterministic population genetic model to calculate the expected power to detect LD as a function of marker density, age of mutation, number of marker alleles, mode of inheritance of a rare disease, and sample size. Our calculations show that multiallelic markers always have more power to detect LD than do diallelic markers (under otherwise equivalent conditions) and that the ratio of the number of diallelic to the number of multiallelic markers needed for equivalent power increases with mutation age and complexity of mode of inheritance. Power equivalent to that achieved by a multiallelic screen can theoretically be achieved by use of a more dense diallelic screen, but mapping panels of the necessary resolution are not currently available and may be difficult to achieve. Genome screening that uses single-marker LD testing may therefore be feasible only for young (<20 generations), rare, monophyletic Mendelian diseases, such as may be found in rapidly growing genetic isolates.

3.
Cui Y, Kang G, Sun K, Qian M, Romero R, Fu W. Genetics, 2008, 179(1): 637-650
Genes are the functional units in most organisms. Compared to genetic variants located outside genes, genic variants are more likely to affect disease risk. The development of the human HapMap project provides an unprecedented opportunity for genetic association studies at the genomewide level for elucidating disease etiology. Currently, most association studies at the single-nucleotide polymorphism (SNP) or the haplotype level rely on the linkage information between SNP markers and disease variants, with which association findings are difficult to replicate. Moreover, variants in genes might not be sufficiently covered by currently available methods. In this article, we present a gene-centric approach via entropy statistics for a genomewide association study to identify disease genes. The new entropy-based approach considers genic variants within one gene simultaneously and is developed on the basis of a joint genotype distribution among genetic variants for an association test. A grouping algorithm based on a penalized entropy measure is proposed to reduce the dimension of the test statistic. Type I error rates and power of the entropy test are evaluated through extensive simulation studies. The results indicate that the entropy test has stable power under different disease models with a reasonable sample size. Compared to single SNP-based analysis, the gene-centric approach has greater power, especially when there is more than one disease variant in a gene. As the genomewide genic SNPs become available, our entropy-based gene-centric approach would provide a robust and computationally efficient way for gene-based genomewide association study.

4.
Family-based association tests for genomewide association scans   Total citations: 7, self-citations: 1, citations by others: 6
With millions of single-nucleotide polymorphisms (SNPs) identified and characterized, genomewide association studies have begun to identify susceptibility genes for complex traits and diseases. These studies involve the characterization and analysis of very-high-resolution SNP genotype data for hundreds or thousands of individuals. We describe a computationally efficient approach to testing association between SNPs and quantitative phenotypes, which can be applied to whole-genome association scans. In addition to observed genotypes, our approach allows estimation of missing genotypes, resulting in substantial increases in power when genotyping resources are limited. We estimate missing genotypes probabilistically using the Lander-Green or Elston-Stewart algorithms and combine high-resolution SNP genotypes for a subset of individuals in each pedigree with sparser marker data for the remaining individuals. We show that power is increased whenever phenotype information for ungenotyped individuals is included in analyses and that high-density genotyping of just three carefully selected individuals in a nuclear family can recover >90% of the information available if every individual were genotyped, for a fraction of the cost and experimental effort. To aid in study design, we evaluate the power of strategies that genotype different subsets of individuals in each pedigree and make recommendations about which individuals should be genotyped at a high density. To illustrate our method, we performed genomewide association analysis for 27 gene-expression phenotypes in 3-generation families (Centre d'Etude du Polymorphisme Humain pedigrees), in which genotypes for ~860,000 SNPs in 90 grandparents and parents are complemented by genotypes for ~6,700 SNPs in a total of 168 individuals. In addition to increasing the evidence of association at 15 previously identified cis-acting associated alleles, our genotype-inference algorithm allowed us to identify associated alleles at 4 cis-acting loci that were missed when analysis was restricted to individuals with the high-density SNP data. Our genotype-inference algorithm and the proposed association tests are implemented in software that is available for free.

5.
Linkage of familial Hibernian fever to chromosome 12p13.   Total citations: 2, self-citations: 0, citations by others: 2
Autosomal dominant periodic fevers are characterized by intermittent febrile attacks of unknown etiology and by recurrent abdominal pains. The biochemical and molecular bases of all autosomal dominant periodic fevers are unknown, and only familial Hibernian fever (FHF) has been described as a distinct clinical entity. FHF has been reported in three families: the original Irish-Scottish family and two Irish families with similar clinical features. We have undertaken a genomewide search in these families and report significant multipoint LOD scores between the disease and markers on chromosome 12p13. Cumulative multipoint linkage analyses indicate that an FHF gene is likely to be located in an 8-cM interval between D12S77 and D12S356, with a maximum LOD score (Z max) of 3.79. The two-point Z max was 3.11, for D12S77. There was no evidence of genetic heterogeneity in these three families; it is proposed that these markers should be tested in other families, of different background, that have autosomal dominant periodic fever, as a prelude to identification of the FHF-susceptibility gene.

6.
The association between major histocompatibility (MH) polymorphism and the severity of infection by amoebic gill disease (AGD) was investigated across 30 full sibling families of Atlantic salmon. Individuals were challenged with AGD for 19 days and then their severity of infection scored by histopathological examination of the gills. Fish were then genotyped for the MH class I (Sasa-UBA) and MH class II alpha (Sasa-DAA) genes using polymorphic repeats embedded within the 3' untranslated regions of the Sasa-UBA and Sasa-DAA genes. High variation in the severity of infection was observed across the sample material, ranging from 0% to 85% gill filaments infected. In total, seven Sasa-DAA-3UTR and ten Sasa-UBA-3UTR marker alleles were identified across the 30 families. A significant association between the marker allele Sasa-DAA-3UTR 239 and a reduction in AGD severity was detected. There was also a significant association found between AGD severity and the presence of two Sasa-DAA-3UTR genotypes. While the associations between MH allele/genotypes and AGD severity reported herein may be statistically significant, the small sample sizes observed for some alleles and genotypes mean these associations should be considered as suggestive and future research is required to verify their biological significance.

7.
The study of genetic linkage or association in complex traits requires large sample sizes, as the expected effect sizes are small and extremely low significance levels need to be adopted. One possible way to reduce the numbers of phenotypings and genotypings is the use of a sequential study design. Here, average sample sizes are decreased by conducting interim analyses with the possibility to stop the investigation early if the result is significant. We applied optimized group sequential study designs to the analysis of genetic linkage (one-sided mean test) and association (two-sided transmission/disequilibrium test). For designs with two and three stages at overall significance levels of .05 and .0001 and a power of .8, we calculated necessary sample sizes, time points, and critical boundaries for interim and final analyses. Monte Carlo simulation analyses were performed to confirm the validity of the asymptotic approximation. Furthermore, we calculated average sample sizes required under the null and alternative hypotheses in the different study designs. It was shown that the application of a group sequential design led to a maximal increase in sample size of 8% under the null hypothesis, compared with the fixed-sample design. This was contrasted by savings of up to 20% in average sample sizes under the alternative hypothesis, depending on the applied design. These savings affect the amounts of genotyping and phenotyping required for a study and therefore lead to a significant decrease in cost and time.

8.
We consider the problem of how to detect cognate pairs of proteins that bind when each belongs to a large family of paralogs. To illustrate the problem, we have undertaken a genomewide analysis of interactions of members of the PE and PPE protein families of Mycobacterium tuberculosis. Our computational method uses structural information, operon organization, and protein coevolution to infer the interaction of PE and PPE proteins. Some 289 PE/PPE complexes were predicted out of a possible 5,590 PE/PPE pairs genomewide. Thirty-five of these predicted complexes were also found to have correlated mRNA expression, providing additional evidence for these interactions. We show that our method is applicable to other protein families, by analyzing interactions of the Esx family of proteins. Our resulting set of predictions is a starting point for genomewide experimental interaction screens of the PE and PPE families, and our method may be generally useful for detecting interactions of proteins within families having many paralogs.

9.
Replication of linkage results for complex traits has been exceedingly difficult, owing in part to the inability to measure the precise underlying phenotype, small sample sizes, genetic heterogeneity, and statistical methods employed in analysis. Often, in any particular study, multiple correlated traits have been collected, yet these have been analyzed independently or, at most, in bivariate analyses. Theoretical arguments suggest that full multivariate analysis of all available traits should offer more power to detect linkage; however, this has not yet been evaluated on a genomewide scale. Here, we conduct multivariate genomewide analyses of quantitative-trait loci that influence reading- and language-related measures in families affected with developmental dyslexia. The results of these analyses are substantially clearer than those of previous univariate analyses of the same data set, helping to resolve a number of key issues. These outcomes highlight the relevance of multivariate analysis for complex disorders for dissection of linkage results in correlated traits. The approach employed here may aid positional cloning of susceptibility genes in a wide spectrum of complex traits.

10.
Oil palm (Elaeis guineensis Jacq.) requires 19 years per cycle of phenotypic selection. The use of molecular markers may reduce the generation interval and the cost of oil-palm breeding. Our objectives were to compare, by simulation, the response to phenotypic selection, marker-assisted recurrent selection (MARS), and genomewide selection with small population sizes in oil palm, and assess the efficiency of each method in terms of years and cost per unit gain. Markers significantly associated with the trait were used to calculate the marker scores in MARS, whereas all markers were used (without significance tests) to calculate the marker scores in genomewide selection. Responses to phenotypic selection and genomewide selection were consistently greater than the response to MARS. With population sizes of N = 50 or 70, responses to genomewide selection were 4–25% larger than the corresponding responses to phenotypic selection, depending on the heritability and number of quantitative trait loci. Cost per unit gain was 26–57% lower with genomewide selection than with phenotypic selection when markers cost US $1.50 per data point, and 35–65% lower when markers cost $0.15 per data point. With population sizes of N = 50 or 70, time per unit gain was 11–23 years with genomewide selection and 14–25 years with phenotypic selection. We conclude that for a realistic yet relatively small population size of N = 50 in oil palm, genomewide selection is superior to MARS and phenotypic selection in terms of gain per unit cost and time. Our results should be generally applicable to other tree species that are characterized by long generation intervals, high costs of maintaining breeding plantations, and small population sizes in selection programs.

11.
One of the major challenges facing genome-scan studies to discover disease genes is the assessment of the genomewide significance. The assessment becomes particularly challenging if the scan involves a large number of markers collected from a relatively small number of meioses. Typically, this assessment has two objectives: to assess genomewide significance under the null hypothesis of no linkage and to evaluate true-positive and false-positive prediction error rates under alternative hypotheses. The distinction between these goals allows one to formulate the problem in the well-established paradigm of statistical hypothesis testing. Within this paradigm, we evaluate the traditional criterion of LOD score 3.0 and a recent suggestion of LOD score 3.6, using the Monte Carlo simulation method. The Monte Carlo experiments show that the type I error varies with the chromosome length, with the number of markers, and also with sample sizes. For a typical setup with 50 informative meioses on 50 markers uniformly distributed on a chromosome of average length (i.e., 150 cM), the use of LOD score 3.0 entails an estimated chromosomewide type I error rate of .00574, leading to a genomewide significance level >.05. In contrast, the corresponding type I error for LOD score 3.6 is .00191, giving a genomewide significance level of slightly <.05. However, with a larger sample size and a shorter chromosome, a LOD score between 3.0 and 3.6 may be preferred, on the basis of proximity to the targeted type I error. In terms of reliability, these two LOD-score criteria appear not to have appreciable differences. These simulation experiments also identified factors that influence power and reliability, shedding light on the design of genome-scan studies.
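The step from a chromosomewide to a genomewide error rate in the abstract above can be illustrated with a rough calculation. This is only a sketch under assumptions that are not stated in the abstract: 22 autosomes of equal (average) length, scanned independently. The function name `genomewide_error` is made up for this illustration, and the authors' own extrapolation may differ.

```python
def genomewide_error(chromosomewide_alpha: float, n_chromosomes: int = 22) -> float:
    """Chance of at least one false positive across n independent chromosomes.

    Assumes (hypothetically) that every chromosome has the same type I
    error rate and that chromosomes are tested independently.
    """
    return 1.0 - (1.0 - chromosomewide_alpha) ** n_chromosomes


# Chromosomewide type I error rates quoted in the abstract for each LOD threshold.
reported = {3.0: 0.00574, 3.6: 0.00191}

for lod_threshold, alpha in reported.items():
    rate = genomewide_error(alpha)
    side = "above" if rate > 0.05 else "below"
    print(f"LOD {lod_threshold}: approximate genomewide error {rate:.3f} ({side} .05)")
```

Under these simplifying assumptions the LOD 3.0 criterion lands above the .05 genomewide level and the LOD 3.6 criterion lands slightly below it, which matches the direction of the reported result.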

12.
In two previous articles, we have considered sample sizes required to detect linkage for mapping quantitative-trait loci in humans, using extreme discordant sib pairs. Here, we examine further the use of extreme concordant sib pairs but consider the effect of parents' phenotypes. Sample sizes necessary to obtain a power of 80% with concordant sib pairs at a significance level of .0001 are given, stratified by parental phenotypes. When there is no residual correlation between sibs, the parental phenotypes have little impact on the sample sizes. When residual correlations between sibs exist, we show, however, that power can be considerably reduced by including extreme sib pairs when the parents also have similarly extreme values. Thus, we recommend the exclusion of such pairs from linkage studies. This recommendation reduces the required sample sizes by 3- to 28-fold. The degree of saving in the required sample sizes varies among different models and allele frequencies. The reduction is most dramatic (a 28-fold reduction) for a rare recessive gene.

13.
Proteins     
Proteins continue to surprise and amaze us in the myriad of ways in which they achieve biological function. The Proteins section in this issue of Current Opinion in Structural Biology highlights several proteins in which large conformational changes and evolutionary divergence in structure and function play essential roles in their adaptation to a variety of biological functions. In addition, fundamental advances have been made in research, spurred on by industrial interest in the use of proteins as drug targets or as catalysts. All of the reviews in this section document the fact that multiple crystal structures of a protein in different functional states, and of different members of protein families, are necessary for the composition of a complete structural picture.

14.
Genomewide Scan of Multiple Sclerosis in Finnish Multiplex Families   Total citations: 13, self-citations: 3, citations by others: 10
Multiple sclerosis (MS) is a neurological, demyelinating disorder with a putative autoimmune etiology. It is thought to be a multifactorial disease with a complex mode of inheritance. Here we report the results of a two-stage genomewide scan for loci predisposing to MS. The first stage of the screen, with a low-resolution map, was performed in a selection of 16 pedigrees collected from an isolated Finnish population. Multipoint, non-parametric linkage analysis of the 328 markers did not reveal statistically significant results. However, 10 slightly interesting regions (P = .1-.15) emerged, including our previous findings of the HLA complex on 6p21 and a putative locus on 5p14-p12. Eight of these novel regions were further analyzed by use of denser marker maps, in the second stage of the scan. For the chromosomal regions 4cen, 11tel, and 17q, the statistical significance increased, but not conclusively; for 2q32 and 10q21, the statistical significance did not change. Accordingly, genotyping of the high-density markers in these regions was performed, and the data were analyzed by use of two-point, parametric linkage analysis using the complete pedigree information of the 21 Finnish multiplex families. We detected suggestive evidence for a predisposing locus on chromosomal region 17q22-q24. Several markers on 17q22-q24 yielded positive LOD scores, with the maximum LOD score (Zmax) occurring with D17S807 (Zmax = 2.8, theta = .04; dominant model). Interestingly, a suggestive linkage between MS and the markers on 17q22-q24 was also revealed by a recent genomewide scan in MS families from the United Kingdom.

15.
Peutz-Jeghers syndrome (PJS) is an autosomal dominant disease with variable expression and incomplete penetrance, characterized by mucocutaneous pigmentation and hamartomatous polyposis. Patients with PJS have increased frequency of gastrointestinal and extraintestinal malignancies (ovaries, testes, and breast). In order to map the locus (or loci) associated with PJS, we performed a genomewide linkage analysis, using DNA polymorphisms in six families (two from Spain, two from India, one from the United States, and one from Portugal) comprising a total of 93 individuals, including 39 affected and 48 unaffected individuals and 6 individuals with unknown status. During this study, localization of a PJS gene to 19p13.3 (around marker D19S886) had been reported elsewhere. For our families, marker D19S886 yielded a maximum LOD score of 4.74 at a recombination fraction (theta) of .045; multipoint linkage analysis resulted in a LOD score of 7.51 for the interval between D19S886 and 19 pter. However, markers on 19q13.4 also showed significant evidence for linkage. For example, D19S880 resulted in a maximum LOD score of 3.8 at theta = .13. Most of this positive linkage was contributed by a single family, PJS07. These results confirm the mapping of a common PJS locus on 19p13.3 but also suggest the existence, in a minority of families, of a potential second PJS locus, on 19q13.4. Positional cloning and characterization of the PJS mutations will clarify the genetics of the syndrome and the implication of the gene(s) in the predisposition to neoplasias.
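For readers unfamiliar with the "maximum LOD score at a recombination fraction (theta)" quoted in the abstract above, the textbook two-point LOD score for the simplest case may help. The sketch below assumes phase-known, fully informative meioses and uses made-up counts. It is not the pedigree likelihood used in the study, which has to handle incomplete penetrance and unknown phase; the function names are hypothetical.

```python
import math


def lod_score(theta: float, recombinants: int, meioses: int) -> float:
    """log10 of L(theta) / L(0.5) for phase-known, fully informative meioses."""
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")
    non_recombinants = meioses - recombinants
    if theta == 0.0:
        # theta = 0 is impossible once any recombinant has been observed.
        if recombinants > 0:
            return float("-inf")
        log_linked = 0.0
    else:
        log_linked = (recombinants * math.log10(theta)
                      + non_recombinants * math.log10(1.0 - theta))
    log_unlinked = meioses * math.log10(0.5)
    return log_linked - log_unlinked


def max_lod(recombinants: int, meioses: int) -> tuple[float, float]:
    """Maximum-likelihood theta (capped at 0.5) and the LOD score at that value."""
    theta_hat = min(recombinants / meioses, 0.5)
    return theta_hat, lod_score(theta_hat, recombinants, meioses)


# Hypothetical data: 1 recombinant observed among 20 informative meioses.
theta_hat, z_max = max_lod(recombinants=1, meioses=20)
print(f"theta_hat = {theta_hat:.3f}, Zmax = {z_max:.2f}")
```

A LOD score of 3 corresponds to odds of 1000:1 in favor of linkage at the given theta over free recombination (theta = 0.5).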

16.
The sibship disequilibrium test (SDT) is designed to detect both linkage in the presence of association and association in the presence of linkage (linkage disequilibrium). The test does not require parental data but requires discordant sibships with at least one affected and one unaffected sibling. The SDT has many desirable properties: it uses all the siblings in the sibship; it remains valid if there are misclassifications of the affectation status; it does not detect spurious associations due to population stratification; asymptotically it has a χ² distribution under the null hypothesis; and exact P values can be easily computed for a biallelic marker. We show how to extend the SDT to markers with multiple alleles and how to combine families with parents and data from discordant sibships. We discuss the power of the test by presenting sample-size calculations involving a complex disease model, and we present formulas for the asymptotic relative efficiency (which is approximately the ratio of sample sizes) between SDT and the transmission/disequilibrium test (TDT) for special family structures. For sib pairs, we compare the SDT to a test proposed both by Curtis and, independently, by Spielman and Ewens. We show that, for discordant sib pairs, the SDT has good power for testing linkage disequilibrium relative both to Curtis's tests and to the TDT using trios comprising an affected sib and its parents. With additional sibs, we show that the SDT can be more powerful than the TDT for testing linkage disequilibrium, especially for disease prevalence >.3.
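The biallelic case described in the abstract above can be sketched as a sign test across sibships. The abstract does not give the formula, so the form used here is an assumption: each sibship contributes the sign of the difference between the mean allele count in affected and unaffected sibs, and the statistic is (b - c)² / (b + c) with one degree of freedom. The function name `sdt_biallelic` and the example data are hypothetical.

```python
import math
from fractions import Fraction


def sdt_biallelic(sibships):
    """Sign-test sketch of the SDT for a biallelic marker.

    sibships: list of (affected, unaffected) pairs, each a non-empty list of
    copies (0, 1, or 2) of the allele of interest carried by each sibling.
    """
    b = c = 0  # sibships where affected sibs carry more (b) or fewer (c) copies
    for affected, unaffected in sibships:
        # Exact fractions avoid floating-point ties between equal means.
        diff = (Fraction(sum(affected), len(affected))
                - Fraction(sum(unaffected), len(unaffected)))
        if diff > 0:
            b += 1
        elif diff < 0:
            c += 1
    n = b + c  # sibships with equal means carry no information
    if n == 0:
        return {"b": 0, "c": 0, "chi2": 0.0, "p_asymptotic": 1.0, "p_exact": 1.0}
    chi2 = (b - c) ** 2 / n
    p_asymptotic = math.erfc(math.sqrt(chi2 / 2.0))  # upper tail of chi-square, 1 df
    tail = sum(math.comb(n, i) for i in range(min(b, c) + 1)) / 2 ** n
    p_exact = min(1.0, 2.0 * tail)  # two-sided binomial(n, 1/2) sign test
    return {"b": b, "c": c, "chi2": chi2,
            "p_asymptotic": p_asymptotic, "p_exact": p_exact}


# Hypothetical allele counts for six discordant sibships.
example = [([2], [1]), ([1, 2], [0]), ([1], [1]), ([2], [0, 1]), ([0], [1]), ([1], [0])]
result = sdt_biallelic(example)
print(result)
```

Because only the sign of each within-sibship difference is used, differences in allele frequency between families do not bias the test, which is consistent with the robustness to population stratification claimed in the abstract.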

17.
For the analysis of affected sib pairs (ASPs), a variety of test statistics is applied in genomewide scans with microsatellite markers. Even in multipoint analyses, these statistics might not fully exploit the power of a given sample, because they do not account for incomplete informativity of an ASP. For meta-analyses of linkage and association studies, it has been shown recently that weighting by informativity increases statistical power. With this idea in mind, the first aim of this article was to introduce a new class of tests for ASPs that are based on the mean test. To take into account how much informativity an ASP contributes, we weighted families inversely proportional to their marker informativity. The weighting scheme is obtained by use of the de Finetti representation of the distribution of identity-by-descent values. We derive the limiting distribution of the weighted mean test and demonstrate the validity of the proposed test. We show that it can be much more powerful than the classical mean test in the case of low marker informativity. In the second part of the article, we propose a Monte Carlo simulation approach for evaluating significance among ASPs. We demonstrate the validity of the simulation approach for both the classical and the weighted mean test. Finally, we illustrate the use of the weighted mean test by reanalyzing two published data sets. In both applications, the maximum LOD score of the weighted mean test is 0.6 higher than that of the classical mean test.

18.
Genomewide association studies have been advocated as a promising alternative to genomewide linkage scans for detection of small-effect genes in complex diseases. Comparisons of power and sample size between the two strategies have shown considerable advantages for the association studies. These comparisons assume that the set of markers includes the exact disease-related polymorphism. A concern, however, is that the power of an association study decreases when this is not the case, because of discrepant allele frequencies and less-than-maximum disequilibrium between the disease-related polymorphism and its nearest marker. Here, we quantify this concern by comparing the sample sizes needed by the two strategies when the markers exclude the disease-related polymorphism. For affected sib pairs and their parents, we found that incomplete disequilibrium and differing allele frequencies can have substantial negative impact on the power of association studies, resulting, in some circumstances, in little gain and even in loss of power, compared with linkage analysis. We provide some guidelines for choosing between strategies, for the detection of genes for complex diseases.

19.
Genetic heterogeneity could reduce the power of linkage analysis to detect risk loci for complex traits such as alcohol dependence (AD). Previously, we performed a genomewide linkage analysis for AD in African-Americans (AAs) (Biol Psychiatry 65:111–115, 2009). The power of that linkage analysis could have been reduced by the presence of genetic heterogeneity owing to differences in admixture among AA families. We hypothesized that by examining a study sample whose genetic ancestry was more homogeneous, we could increase the power to detect linkage. To test this hypothesis, we performed ordered subset linkage analysis in 384 AA families using admixture proportion as a covariate to identify a more homogeneous subset of families and determine whether there is increased evidence for linkage with AD. Statistically significant increases in lod scores in subsets relative to the overall sample were identified on chromosomes 4 (P = 0.0001), 12 (P = 0.021), 15 (P = 0.026) and 22 (P = 0.0069). In a subset of 44 families with African ancestry proportions ranging from 0.858 to 0.996, we observed a genomewide significant linkage at 180 cM on chromosome 4 (lod = 4.24, pointwise P < 0.00001, empirical genomewide P = 0.008). A promising candidate gene located there is GLRA3, which encodes a subunit of the glycine neurotransmitter receptor. Our results demonstrate that admixture proportion can be used as a covariate to reduce genetic heterogeneity and enhance the detection of linkage for AD in an admixed population such as AAs. This approach could be applied to any linkage analysis for complex traits conducted in an admixed population.

20.
Design and analysis methods are presented for studying the association of a candidate gene with a disease by using parental data in place of nonrelated controls. This alternative design eliminates spurious differences in allele frequencies between cases and nonrelated controls resulting from different ethnic origins and population stratification for these two groups. We present analysis methods which are based on two genetic relative risks: (1) the relative risk of disease for homozygotes with two copies of the candidate gene versus homozygotes without the candidate gene and (2) the relative risk for heterozygotes with one copy of the candidate gene versus homozygotes without the candidate gene. In addition to estimating the magnitude of these relative risks, likelihood methods allow specific hypotheses to be tested, namely, a test for overall association of the candidate gene with disease, as well as specific genetic hypotheses, such as dominant or recessive inheritance. Two likelihood methods are presented: (1) a likelihood method appropriate when Hardy-Weinberg equilibrium holds and (2) a likelihood method in which we condition on parental genotype data when Hardy-Weinberg equilibrium does not hold. The results for the relative efficiency of these two methods suggest that the conditional approach may at times be preferable, even when equilibrium holds. Sample-size and power calculations are presented for a multitiered design. The purpose of tier 1 is to detect the presence of an abnormal sequence for a postulated candidate gene among a small group of cases. The purpose of tier 2 is to test for association of the abnormal variant with disease, such as by the likelihood methods presented. The purpose of tier 3 is to confirm positive results from tier 2. Results indicate that required sample sizes are smaller when expression of disease is recessive, rather than dominant, and that, for recessive disease and large relative risks, necessary sample sizes may be feasible, even if only a small percentage of the disease can be attributed to the candidate gene.
