Similar articles
 Found 20 similar articles (search time: 140 ms)
1.
A dozen genes/regions have been confirmed as genetic risk factors for oral clefts in human association and linkage studies, and animal models argue even more genes may be involved. Genomic sequencing studies should identify specific causal variants and may reveal additional genes as influencing risk to oral clefts, which have a complex and heterogeneous etiology. We conducted a whole exome sequencing (WES) study to search for potentially causal variants using affected relatives drawn from multiplex cleft families. Two or three affected second, third, and higher degree relatives from 55 multiplex families were sequenced. We examined rare single nucleotide variants (SNVs) shared by affected relatives in 348 recognized candidate genes. Exact probabilities that affected relatives would share these rare variants were calculated, given pedigree structures, and corrected for the number of variants tested. Five novel and potentially damaging SNVs shared by affected distant relatives were found and confirmed by Sanger sequencing. One damaging SNV in CDH1, shared by three affected second cousins from a single family, attained statistical significance (P = 0.02 after correcting for multiple tests). Family-based designs such as the one used in this WES study offer important advantages for identifying genes likely to be causing complex and heterogeneous disorders.  相似文献   
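
The exact sharing probabilities mentioned above can be illustrated for the simplest case, a pair of k-th cousins. This is a minimal sketch, not the authors' implementation. It assumes the variant is rare enough to enter the pedigree as a single copy through exactly one founder, that every founder is equally likely to be that carrier, and that transmission is Mendelian. The function name `cousin_sharing_probability` is hypothetical.

```python
from fractions import Fraction


def cousin_sharing_probability(k):
    """P(both k-th cousins carry a rare variant | at least one carries it).

    Assumptions (illustrative only): one copy of the variant enters the
    pedigree through exactly one founder, and all founders are equally likely
    to be that founder. k = 0 means full siblings, k = 1 first cousins, etc.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    meioses = k + 1                      # meioses from shared ancestor to each cousin
    half = Fraction(1, 2)
    p_carry = half ** meioses            # chance one cousin inherits from a shared ancestor

    # Two shared ancestors: each cousin's lineage transmits independently.
    p_both = 2 * p_carry ** 2
    p_at_least_one = 2 * (1 - (1 - p_carry) ** 2)

    # Married-in founders sit j meioses above exactly one of the two cousins,
    # so they can never produce sharing but do contribute to "at least one".
    for j in range(1, meioses):
        p_at_least_one += 2 * half ** j

    return p_both / p_at_least_one


if __name__ == "__main__":
    for k, label in enumerate(["siblings", "first cousins", "second cousins"]):
        print(label, cousin_sharing_probability(k))
```

Under these assumptions, more distant relatives are much less likely to share a rare variant by chance, which is why sharing among distant affected relatives is informative. Sharing among three relatives, as in the CDH1 family, would require enumerating the full pedigree rather than this pairwise shortcut.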

2.
Exome sequencing in families affected by rare genetic disorders has the potential to rapidly identify new disease genes (genes in which mutations cause disease), but the identification of a single causal mutation among thousands of variants remains a significant challenge. We developed a scoring algorithm to prioritize potential causal variants within a family according to segregation with the phenotype, population frequency, predicted effect, and gene expression in the tissue(s) of interest. To narrow the search space in families with multiple affected individuals, we also developed two complementary approaches to exome-based mapping of autosomal-dominant disorders. One approach identifies segments of maximum identity by descent among affected individuals; the other nominates regions on the basis of shared rare variants and the absence of homozygous differences between affected individuals. We showcase our methods by using exome sequence data from families affected by autosomal-dominant retinitis pigmentosa (adRP), a rare disorder characterized by night blindness and progressive vision loss. We performed exome capture and sequencing on 91 samples representing 24 families affected by probable adRP but lacking common disease-causing mutations. Eight of 24 families (33%) were revealed to harbor high-scoring, most likely pathogenic (by clinical assessment) mutations affecting known RP genes. Analysis of the remaining 17 families identified candidate variants in a number of interesting genes, some of which have withstood further segregation testing in extended pedigrees. To empower the search for Mendelian-disease genes in family-based sequencing studies, we implemented them in a cross-platform-compatible software package, MendelScan, which is freely available to the research community.  相似文献   

3.
Coeliac disease (CeD) is a highly heritable common autoimmune disease involving chronic small intestinal inflammation in response to dietary wheat. The human leukocyte antigen (HLA) region, and 40 newer regions identified by genome wide association studies (GWAS) and dense fine mapping, account for ∼40% of the disease heritability. We hypothesized that in pedigrees with multiple individuals with CeD rare [minor allele frequency (MAF) <0.5%] mutations of larger effect size (odds ratios of ∼ 2–5) might exist. We sequenced the exomes of 75 coeliac individuals of European ancestry from 55 multiply affected families. We selected interesting variants and genes for further follow up using a combination of: an assessment of shared variants between related subjects, a model-free linkage test, and gene burden tests for multiple, potentially causal, variants. We next performed highly multiplexed amplicon resequencing of all RefSeq exons from 24 candidate genes selected on the basis of the exome sequencing data in 2,248 unrelated coeliac cases and 2,230 controls. 1,335 variants with a 99.9% genotyping call rate were observed in 4,478 samples, of which 939 were present in coding regions of 24 genes (Ti/Tv 2.99). 91.7% of coding variants were rare (MAF <0.5%) and 60% were novel. Gene burden tests performed on rare functional variants identified no significant associations (p<1×10−3) in the resequenced candidate genes. Our strategy of sequencing multiply affected families with deep follow up of candidate genes has not identified any new CeD risk mutations.  相似文献   

4.
5.
We propose a general statistical framework for meta-analysis of gene- or region-based multimarker rare variant association tests in sequencing association studies. In genome-wide association studies, single-marker meta-analysis has been widely used to increase statistical power by combining results via regression coefficients and standard errors from different studies. In analysis of rare variants in sequencing studies, region-based multimarker tests are often used to increase power. We propose meta-analysis methods for commonly used gene- or region-based rare variants tests, such as burden tests and variance component tests. Because estimation of regression coefficients of individual rare variants is often unstable or not feasible, the proposed method avoids this difficulty by calculating score statistics instead that only require fitting the null model for each study and then aggregating these score statistics across studies. Our proposed meta-analysis rare variant association tests are conducted based on study-specific summary statistics, specifically score statistics for each variant and between-variant covariance-type (linkage disequilibrium) relationship statistics for each gene or region. The proposed methods are able to incorporate different levels of heterogeneity of genetic effects across studies and are applicable to meta-analysis of multiple ancestry groups. We show that the proposed methods are essentially as powerful as joint analysis by directly pooling individual level genotype data. We conduct extensive simulations to evaluate the performance of our methods by varying levels of heterogeneity across studies, and we apply the proposed methods to meta-analysis of rare variant effects in a multicohort study of the genetics of blood lipid levels.  相似文献   
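
The idea of aggregating study-level score statistics can be sketched for the burden-test case. This is a simplified illustration, not the authors' code: it assumes homogeneous genetic effects across studies, that every study reports a score vector and a between-variant covariance matrix for the same ordered set of variants, and that fixed per-variant weights are supplied. The function name `meta_burden_statistic` is hypothetical.

```python
def meta_burden_statistic(study_scores, study_covs, weights):
    """Combine per-study score statistics into one burden-type test statistic.

    study_scores: list of per-study score vectors (one entry per variant).
    study_covs:   list of per-study covariance matrices of those scores.
    weights:      per-variant weights shared by all studies (an assumption).

    Returns (w'U)^2 / (w'Vw), where U and V are sums over studies. Under the
    null and homogeneous effects this is approximately chi-square with 1 df.
    """
    m = len(weights)
    total_score = [0.0] * m
    total_cov = [[0.0] * m for _ in range(m)]

    # Sum the summary statistics across studies; no individual-level data needed.
    for scores, cov in zip(study_scores, study_covs):
        for i in range(m):
            total_score[i] += scores[i]
            for j in range(m):
                total_cov[i][j] += cov[i][j]

    numerator = sum(w * u for w, u in zip(weights, total_score)) ** 2
    denominator = sum(
        weights[i] * total_cov[i][j] * weights[j]
        for i in range(m)
        for j in range(m)
    )
    if denominator <= 0:
        raise ValueError("weighted variance must be positive")
    return numerator / denominator
```

Only the null model has to be fitted in each study to obtain these inputs, which is the property the abstract highlights. Variance component tests and heterogeneous-effect versions need more machinery than this sketch shows.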

6.
A population association has consistently been observed between insulin-dependent diabetes mellitus (IDDM) and the "class 1" alleles of the region of tandem-repeat DNA (5′ flanking polymorphism [5′FP]) adjacent to the insulin gene on chromosome 11p. This finding suggests that the insulin gene region contains a gene or genes contributing to IDDM susceptibility. However, several studies that have sought to show linkage with IDDM by testing for cosegregation in affected sib pairs have failed to find evidence for linkage. As means for identifying genes for complex diseases, both the association and the affected-sib-pairs approaches have limitations. It is well known that population association between a disease and a genetic marker can arise as an artifact of population structure, even in the absence of linkage. On the other hand, linkage studies with modest numbers of affected sib pairs may fail to detect linkage, especially if there is linkage heterogeneity. We consider an alternative method to test for linkage with a genetic marker when population association has been found. Using data from families with at least one affected child, we evaluate the transmission of the associated marker allele from a heterozygous parent to an affected offspring. This approach has been used by several investigators, but the statistical properties of the method as a test for linkage have not been investigated. In the present paper we describe the statistical basis for this "transmission test for linkage disequilibrium" (transmission/disequilibrium test [TDT]). We then show the relationship of this test to tests of cosegregation that are based on the proportion of haplotypes or genes identical by descent in affected sibs. The TDT provides strong evidence for linkage between the 5′FP and susceptibility to IDDM. The conclusions from this analysis apply in general to the study of disease associations, where genetic markers are usually closely linked to candidate genes. When a disease is found to be associated with such a marker, the TDT may detect linkage even when haplotype-sharing tests do not.
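
The transmission test described above is commonly computed as a McNemar-type statistic on transmissions from heterozygous parents. The sketch below assumes a biallelic marker and counts supplied by the caller; the function name `tdt_statistic` and any counts passed to it are illustrative, not data from the paper.

```python
import math


def tdt_statistic(transmitted, untransmitted):
    """McNemar-type TDT statistic for a biallelic marker.

    transmitted:   number of heterozygous parents who passed the associated
                   allele to an affected child.
    untransmitted: number of heterozygous parents who passed the other allele.

    Returns (chi2, p_value); chi2 is referred to a chi-square distribution
    with 1 degree of freedom under the null of no linkage.
    """
    n = transmitted + untransmitted
    if n == 0:
        raise ValueError("need at least one heterozygous parent")
    chi2 = (transmitted - untransmitted) ** 2 / n
    # Survival function of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    print(tdt_statistic(60, 40))
```

Only heterozygous parents contribute, which is why the test is not distorted by population structure: each parent serves as their own matched control.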

7.
The recent development of sequencing technology allows identification of association between the whole spectrum of genetic variants and complex diseases. Over the past few years, a number of association tests for rare variants have been developed. Jointly testing for association between genetic variants and multiple correlated phenotypes may increase the power to detect causal genes in family-based studies, but familial correlation needs to be appropriately handled to avoid an inflated type I error rate. Here we propose a novel approach for multivariate family data using kernel machine regression (denoted as MF-KM) that is based on a linear mixed-model framework and can be applied to a large range of studies with different types of traits. In our simulation studies, the usual kernel machine test has inflated type I error rates when applied directly to familial data, while our proposed MF-KM method preserves the expected type I error rates. Moreover, the MF-KM method has increased power compared to methods that either analyze each phenotype separately while considering family structure or use only unrelated founders from the families. Finally, we illustrate our proposed methodology by analyzing whole-genome genotyping data from a lung function study.  相似文献   

8.
With the rise of sequencing technologies, it is now feasible to assess the role rare variants play in the genetic contribution to complex trait variation. While some of the earlier targeted sequencing studies successfully identified rare variants of large effect, unbiased gene discovery using exome sequencing has experienced limited success for complex traits. Nevertheless, rare variant association studies have demonstrated that rare variants do contribute to phenotypic variability, but sample sizes will likely have to be even larger than those of common variant association studies to be powered for the detection of genes and loci. Large-scale sequencing efforts of tens of thousands of individuals, such as the UK10K Project and aggregation efforts such as the Exome Aggregation Consortium, have made great strides in advancing our knowledge of the landscape of rare variation, but there remain many considerations when studying rare variation in the context of complex traits. We discuss these considerations in this review, presenting a broad range of topics at a high level as an introduction to rare variant analysis in complex traits including the issues of power, study design, sample ascertainment, de novo variation, and statistical testing approaches. Ultimately, as sequencing costs continue to decline, larger sequencing studies will yield clearer insights into the biological consequence of rare mutations and may reveal which genes play a role in the etiology of complex traits.  相似文献   

9.
A comparative study of sibship tests of linkage and/or association. (Cited by 4; self-citations: 0; citations by others: 4)
Population-based tests of association have used data from either case-control studies or studies based on trios (affected child and parents). Case-control studies are more prone to false-positive results caused by inappropriate controls, which can occur if, for example, there is population admixture or stratification. An advantage of family-based tests is that cases and controls are well matched, but parental data may not always be available, especially for late-onset diseases. Three recent family-based tests of association and linkage utilize unaffected siblings as surrogates for untyped parents. In this paper, we propose an extension of one of these tests. We describe and compare the four tests in the context of a complex disease for both biallelic and multiallelic markers, as well as for sibships of different sizes. We also examine the consequences of having some parental data in the sample.  相似文献   

10.
Alzheimer's disease (AD) is a common and complex neurodegenerative disease. Age at onset (AAO) of AD is an important component phenotype with a genetic basis, and identification of genes in which variation affects AAO would contribute to identification of factors that affect timing of onset. Increase in AAO through prevention or therapeutic measures would have enormous benefits by delaying AD and its associated morbidities. In this paper, we performed a family‐based genome‐wide association study for AAO of late‐onset AD in whole exome sequence data generated in multigenerational families with multiple AD cases. We conducted single marker and gene‐based burden tests for common and rare variants, respectively. We combined association analyses with variance component linkage analysis, and with reference to prior studies, in order to enhance evidence of the identified genes. For variants and genes implicated by the association study, we performed a gene‐set enrichment analysis to identify potential novel pathways associated with AAO of AD. We found statistically significant association with AAO for three genes (WRN, NTN4 and LAMC3) with common associated variants, and for four genes (SLC8A3, SLC19A3, MADD and LRRK2) with multiple rare‐associated variants that have a plausible biological function related to AD. The genes we have identified are in pathways that are strong candidates for involvement in the development of AD pathology and may lead to a better understanding of AD pathogenesis.  相似文献   

11.
Suppose that many polymorphic sites have been identified and genotyped in a region showing strong linkage with a trait. A key question of interest is which site (or combination of sites) in the region influences susceptibility to the trait. We have developed a novel statistical approach to this problem, in the context of qualitative-trait mapping, in which we use linkage data to identify the polymorphic sites whose genotypes could fully explain the observed linkage to the region. The information provided by this analysis is different from that provided by tests of either linkage or association. Our approach is based on the observation that if a particular site is the only site in the region that influences the trait, then, conditional on the genotypes at that site for the affected relatives, there should be no unexplained oversharing in the region among affected individuals. We focus on the affected sib-pair study design and develop test statistics that are variations on the usual allele-sharing methods used in linkage studies. We perform hypothesis tests and derive a confidence set for the true causal polymorphic site, under the assumption that there is only one site in the region influencing the trait. Our method is appropriate under a very general model for how the site influences the trait, including epistasis with unlinked loci, correlated environmental effects within families, and gene-environment interaction. We extend our method to larger sibships and apply it to an NIDDM1 data set.

12.
To fine map genes, investigators often test for disease-marker association in chromosomal regions with evidence for linkage. Given a marker allele tentatively associated with disease, one would ask if this allele, or one in linkage disequilibrium (LD) with it, could account in part for the observed linkage signal. This question can be addressed by determining if families selected on the basis of the presence of the tentatively associated allele show stronger evidence of linkage as measured by increased allele sharing identical by descent (IBD) by affected family members. However, common selection strategies can be biased for or against linkage in the marker region, even given no disease-marker association. We define unbiased selection schemes and extend the definition to allow weighted selection on the basis of all genotyped family members. For affected-sibship data, we describe three genotype-based weight variables, corresponding to dominant, recessive, and additive models. We then introduce a test for association of a family weight variable with excess IBD sharing. This test allows us to determine if the linkage signal in a region can be attributed in part to the presence of a marker allele, either because of direct involvement in disease etiology or because of LD with a predisposing genetic variant. For samples of 500 affected sib pairs, the tests are powerful in detection of genotype-IBD sharing association, even for disease models with sib relative risk as low as λ_S = 1.1, or when evidence for linkage is absent because of sampling variation. This makes our method a new tool for detecting linkage as well as association, especially in regions harboring a candidate gene. We have implemented these methods in the software package GIST (Genotype-IBD Sharing Test).

13.
The sibship disequilibrium test (SDT) is designed to detect both linkage in the presence of association and association in the presence of linkage (linkage disequilibrium). The test does not require parental data but requires discordant sibships with at least one affected and one unaffected sibling. The SDT has many desirable properties: it uses all the siblings in the sibship; it remains valid if there are misclassifications of the affectation status; it does not detect spurious associations due to population stratification; asymptotically it has a χ² distribution under the null hypothesis; and exact P values can be easily computed for a biallelic marker. We show how to extend the SDT to markers with multiple alleles and how to combine families with parents and data from discordant sibships. We discuss the power of the test by presenting sample-size calculations involving a complex disease model, and we present formulas for the asymptotic relative efficiency (which is approximately the ratio of sample sizes) between SDT and the transmission/disequilibrium test (TDT) for special family structures. For sib pairs, we compare the SDT to a test proposed both by Curtis and, independently, by Spielman and Ewens. We show that, for discordant sib pairs, the SDT has good power for testing linkage disequilibrium relative both to Curtis's tests and to the TDT using trios comprising an affected sib and its parents. With additional sibs, we show that the SDT can be more powerful than the TDT for testing linkage disequilibrium, especially for disease prevalence > 0.3.
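
One way to picture a test of this kind for a biallelic marker is as a sign test across discordant sibships. The sketch below is an illustrative reading, not code from the paper: it assumes each sibship is summarized by the number of copies of the candidate allele carried by each affected and each unaffected sibling, and it compares the mean count in the two groups. The function name `sdt_statistic` and the input layout are hypothetical.

```python
def sdt_statistic(sibships):
    """Sign-test statistic over discordant sibships (biallelic marker).

    sibships: list of (affected_counts, unaffected_counts) pairs, where each
              element is a list of allele counts (0, 1, or 2) per sibling.
              Every sibship needs at least one affected and one unaffected sib.

    Returns (b - c)^2 / (b + c), where b counts sibships in which affected
    sibs carry more copies on average and c counts the reverse. Sibships with
    equal means are uninformative and are skipped.
    """
    b = 0  # mean allele count higher in affected sibs
    c = 0  # mean allele count higher in unaffected sibs
    for affected, unaffected in sibships:
        if not affected or not unaffected:
            raise ValueError("each sibship must be discordant")
        # Compare means by cross-multiplying to stay in exact integer arithmetic.
        lhs = sum(affected) * len(unaffected)
        rhs = sum(unaffected) * len(affected)
        if lhs > rhs:
            b += 1
        elif lhs < rhs:
            c += 1
    if b + c == 0:
        raise ValueError("no informative sibships")
    return (b - c) ** 2 / (b + c)
```

Because each sibship contributes only a sign, the comparison is made within families, which is consistent with the robustness to population stratification described above. The statistic is referred to a χ² distribution with 1 degree of freedom in large samples.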

14.
Wijsman EM. Human Genetics, 2012, 131(10): 1555-1563
Rare variation is the current frontier in human genetics. The large pedigree design is practical, efficient, and well-suited for investigating rare variation. In large pedigrees, specific rare variants that co-segregate with a trait will occur in sufficient numbers so that effects can be measured, and evidence for association can be evaluated, by making use of methods that fully use the pedigree information. Evidence from linkage analysis can focus investigation, both reducing the multiple testing burden and expanding the variants that can be evaluated and followed up, as recent studies have shown. The large pedigree design requires only a small fraction of the sample size needed to identify rare variants of interest in population-based designs, and many highly suitable, well-understood, and available statistical and computational tools already exist. Samples consisting of large pedigrees with existing rich phenotype and genome scan data should be prime candidates for high-throughput sequencing in the search of the determinants of complex traits.  相似文献   

15.
Haseman and Elston (H-E) proposed a robust test to detect linkage between a quantitative trait and a genetic marker. In their method the squared sib-pair trait difference is regressed on the estimated proportion of alleles at a locus shared identical by descent by sib pairs. This method has recently been improved by changing the dependent variable from the squared difference to the mean-corrected product of the sib-pair trait values, a significantly positive regression indicating linkage. Because situations arise in which the original test is more powerful, a further improvement of the H-E method occurs when the dependent variable is changed to a weighted average of the squared sib-pair trait difference and the squared sib-pair mean-corrected trait sum. Here we propose an optimal method of performing this weighting for larger sibships, allowing for the correlation between pairs within a sibship. The optimal weights are inversely proportional to the residual variances obtained from the two different regressions based on the squared sib-pair trait differences and the squared sib-pair mean-corrected trait sums, respectively, allowing for correlations among sib pairs. The proposed method is compared with the existing extension of the H-E approach for larger sibships. Control of the type I error probabilities for sibships of any size can be improved by using a generalized estimating equation approach and the robust sandwich estimate of the variance, or a Monte-Carlo permutation test.  相似文献   
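
The two basic regressions mentioned above can be sketched with ordinary least squares. This is a minimal illustration that treats sib pairs as independent, so it does not include the optimal weighting or the within-sibship correlation handling that the paper proposes. The function name `haseman_elston_fit` and the `response` argument are hypothetical.

```python
def haseman_elston_fit(trait_pairs, ibd_proportions, response="squared_difference"):
    """Least-squares fit of a sib-pair trait summary on estimated IBD sharing.

    trait_pairs:     list of (trait_sib1, trait_sib2) tuples.
    ibd_proportions: estimated proportion of alleles shared IBD per pair.
    response:        "squared_difference" (original method; a negative slope
                     suggests linkage) or "cross_product" (mean-corrected
                     product; a positive slope suggests linkage).

    Returns (slope, intercept). Pairs are treated as independent here.
    """
    if response == "squared_difference":
        y = [(a - b) ** 2 for a, b in trait_pairs]
    elif response == "cross_product":
        values = [v for pair in trait_pairs for v in pair]
        mu = sum(values) / len(values)          # overall trait mean
        y = [(a - mu) * (b - mu) for a, b in trait_pairs]
    else:
        raise ValueError("unknown response: " + response)

    n = len(y)
    x_mean = sum(ibd_proportions) / n
    y_mean = sum(y) / n
    sxx = sum((x - x_mean) ** 2 for x in ibd_proportions)
    if sxx == 0:
        raise ValueError("IBD proportions must vary across pairs")
    sxy = sum((x - x_mean) * (v - y_mean) for x, v in zip(ibd_proportions, y))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean
```

A significance test on the slope would need a variance estimate; for sibships larger than two, the abstract points to generalized estimating equations with a sandwich variance or a permutation test for that purpose.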

16.
Rare variants affecting phenotype pose a unique challenge for human genetics. Although genome-wide association studies have successfully detected many common causal variants, they are underpowered in identifying disease variants that are too rare or population-specific to be imputed from a general reference panel and thus are poorly represented on commercial SNP arrays. We set out to overcome these challenges and detect association between disease and rare alleles using SNP arrays by relying on long stretches of genomic sharing that are identical by descent. We have developed an algorithm, DASH, which builds upon pairwise identical-by-descent shared segments to infer clusters of individuals likely to be sharing a single haplotype. DASH constructs a graph with nodes representing individuals and links on the basis of such segments spanning a locus and uses an iterative minimum cut algorithm to identify densely connected components. We have applied DASH to simulated data and diverse GWAS data sets by constructing haplotype clusters and testing them for association. In simulations we show this approach to be significantly more powerful than single-marker testing in an isolated population that is from Kosrae, Federated States of Micronesia and has abundant IBD, and we provide orthogonal information for rare, recent variants in the outbred Wellcome Trust Case-Control Consortium (WTCCC) data. In both cohorts, we identified a number of haplotype associations, five such loci in the WTCCC data and ten in the isolated, that were conditionally significant beyond any individual nearby markers. We have replicated one of these loci in an independent European cohort and identified putative structural changes in low-pass whole-genome sequence of the cluster carriers.  相似文献   

17.
Human genetics researchers have been intrigued for many years by weak-to-moderate associations between markers and diseases. However, in most cases of association, the cause of this phenomenon is still not known. Recently, interest has grown in pursuing association studies for complex diseases, either instead of or in addition to linkage studies. Hence, it is timely to reconsider what a disease-marker association, particularly in the weak-to-moderate range (relative risk < 10), can tell us about disease etiology. To this end, this study accomplishes three aims: (1) It formulates two different models explaining weak-to-moderate associations and derives the relationship between them. One is a linkage disequilibrium model, and the other is a "susceptibility," or pure association, model. The importance of drawing the distinction between these two models and the implications for our understanding of the genetics of human disease will also be discussed. It will be argued that the linkage disequilibrium model represents true linkage but that the susceptibility model does not. (2) It examines two family-based association tests proposed recently by Parsian et al. and Spielman et al. and derives formulas for their behavior under the two models described above. It demonstrates that these tests yield almost identical results under these two models. It shows that, whereas these tests can confirm an association, they cannot determine whether the association is caused by the linkage disequilibrium model or the susceptibility model. The study also characterizes the probabilities yielded by the family association tests in the presence of weak-to-moderate associations, which will aid researchers using these tests. (3) It proposes two approaches, both based on linkage analysis, which can distinguish between the two models described above. One approach involves a straightforward linkage analysis of the data; the other involves a partitioned association-linkage (PAL) test, as suggested by Greenberg. Formulas are derived for testing identity by descent in affected sib pairs by using both approaches. (4) Finally, the formulas and arguments are illustrated with two examples from the literature and one computer-simulated data set.

18.
In disease studies, family-based designs have become an attractive approach to analyzing next-generation sequencing (NGS) data for the identification of rare mutations enriched in families. Substantial research effort has been devoted to developing pipelines for automating sequence alignment, variant calling, and annotation. However, fewer pipelines have been designed specifically for disease studies. Most of the current analysis pipelines for family-based disease studies using NGS data focus on a specific function, such as identifying variants with Mendelian inheritance or identifying shared chromosomal regions among affected family members. Consequently, some other useful family-based analysis tools, such as imputation, linkage, and association tools, have yet to be integrated and automated. We developed FamPipe, a comprehensive analysis pipeline, which includes several family-specific analysis modules, including the identification of shared chromosomal regions among affected family members, prioritizing variants assuming a disease model, imputation of untyped variants, and linkage and association tests. We used simulation studies to compare properties of some modules implemented in FamPipe, and based on the results, we provided suggestions for the selection of modules to achieve an optimal analysis strategy. The pipeline is under the GNU GPL License and can be downloaded for free at http://fampipe.sourceforge.net.
This is a PLOS Computational Biology Software article.

19.
Summary. We investigated possible association of and linkage between HLA and familial polyposis coli (FPC). In 182 individuals from 66 pedigrees of FPC and 108 individuals from a normal population, HLA-A, -B, and -C antigens were determined. When the frequencies of HLA antigens in 66 unrelated patients and in normal controls were compared, no association of FPC with HLA was observed. For the linkage analysis, HLA haplotypes of 17 affected sib pairs were investigated by the affected sib pair method. The number of pairs which shared two, one, and no haplotypes identical by descent was not significantly different from the number expected with random occurrence (P > 0.95). Finally, seven families were analyzed using Morton's sequential test. A maximum lod score of −0.056 at a recombination fraction of 0.4, and a lod of −3.089 at a recombination fraction of 0.05 were obtained. Therefore, there is neither an association of nor linkage between FPC and HLA.
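
The affected sib pair comparison described above can be pictured as a goodness-of-fit test of observed haplotype sharing against the 1:2:1 ratio expected without linkage. The sketch below is one common form of that test and is not necessarily the exact procedure used in the study; the function name `asp_sharing_test` is hypothetical, and no counts from the paper are assumed.

```python
import math


def asp_sharing_test(share_two, share_one, share_none):
    """Chi-square goodness-of-fit test for affected sib pair IBD sharing.

    Arguments are the numbers of affected sib pairs sharing two, one, and
    zero haplotypes identical by descent. Without linkage the expected
    proportions are 1/4, 1/2, 1/4.

    Returns (chi2, p_value) with 2 degrees of freedom.
    """
    observed = (share_two, share_one, share_none)
    n = sum(observed)
    if n == 0:
        raise ValueError("need at least one sib pair")
    expected = (n / 4, n / 2, n / 4)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of chi-square with 2 df: P(X > x) = exp(-x / 2).
    p_value = math.exp(-chi2 / 2)
    return chi2, p_value
```

With a sample as small as 17 pairs the large-sample approximation is rough, so an exact multinomial calculation would be a reasonable alternative.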

20.
Exome sequencing offers the potential to study the population-genomic variables that underlie patterns of deleterious variation. Runs of homozygosity (ROH) are long stretches of consecutive homozygous genotypes probably reflecting segments shared identically by descent as the result of processes such as consanguinity, population size reduction, and natural selection. The relationship between ROH and patterns of predicted deleterious variation can provide insight into the way in which these processes contribute to the maintenance of deleterious variants. Here, we use exome sequencing to examine ROH in relation to the distribution of deleterious variation in 27 individuals of varying levels of apparent inbreeding from 6 human populations. A significantly greater fraction of all genome-wide predicted damaging homozygotes fall in ROH than would be expected from the corresponding fraction of nondamaging homozygotes in ROH (p < 0.001). This pattern is strongest for long ROH (p < 0.05). ROH, and especially long ROH, harbor disproportionately more deleterious homozygotes than would be expected on the basis of the total ROH coverage of the genome and the genomic distribution of nondamaging homozygotes. The results accord with a hypothesis that recent inbreeding, which generates long ROH, enables rare deleterious variants to exist in homozygous form. Thus, just as inbreeding can elevate the occurrence of rare recessive diseases that represent homozygotes for strongly deleterious mutations, inbreeding magnifies the occurrence of mildly deleterious variants as well.  相似文献   
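
The definition of ROH as stretches of consecutive homozygous genotypes can be illustrated with a very small scan. This is a toy sketch: it assumes genotypes are given in genomic order as two-character strings such as "AA" or "AG", and it thresholds on the number of consecutive sites. Practical ROH callers typically work with physical length and tolerate occasional heterozygous or missing calls, which this sketch does not. The function name `find_roh` is hypothetical.

```python
def find_roh(genotypes, min_sites):
    """Return runs of consecutive homozygous genotypes.

    genotypes: list of two-character genotype strings in genomic order,
               e.g. ["AA", "AG", "GG"].
    min_sites: minimum number of consecutive homozygous sites to report.

    Returns a list of (start, end) index pairs, with end exclusive.
    """
    runs = []
    start = None
    for i, genotype in enumerate(genotypes):
        homozygous = genotype[0] == genotype[1]
        if homozygous and start is None:
            start = i                      # a new run begins here
        elif not homozygous and start is not None:
            if i - start >= min_sites:
                runs.append((start, i))    # close a sufficiently long run
            start = None
    # Close a run that extends to the last site.
    if start is not None and len(genotypes) - start >= min_sites:
        runs.append((start, len(genotypes)))
    return runs
```

Counting predicted damaging homozygotes inside and outside the reported intervals would then give the kind of comparison the abstract describes.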
