Similar articles
1.
Genome and exome sequencing yield extensive catalogues of human genetic variation. However, pinpointing the few phenotypically causal variants among the many variants present in human genomes remains a major challenge, particularly for rare and complex traits wherein genetic information alone is often insufficient. Here, we review approaches to estimate the deleteriousness of single nucleotide variants (SNVs), which can be used to prioritize disease-causal variants. We describe recent advances in comparative and functional genomics that enable systematic annotation of both coding and non-coding variants. Application and optimization of these methods will be essential to find the genetic answers that sequencing promises to hide in plain sight.

2.
Sensitivity to pain varies considerably between individuals and is known to be heritable. Increased sensitivity to experimental pain is a risk factor for developing chronic pain, a common and debilitating but poorly understood symptom. To understand mechanisms underlying pain sensitivity and to search for rare gene variants (MAF < 5%) influencing pain sensitivity, we explored the genetic variation in individuals' responses to experimental pain. Quantitative sensory testing to heat pain was performed in 2,500 volunteers from TwinsUK (TUK): exome sequencing to a depth of 70× was carried out on DNA from singletons at the high and low ends of the heat pain sensitivity distribution in two separate subsamples. Thus in TUK1, 101 pain-sensitive and 102 pain-insensitive individuals were examined, while in TUK2 there were 114 and 96 individuals respectively. A combination of methods was used to test the association between rare variants and pain sensitivity, and the function of the genes identified was explored using network analysis. Using causal reasoning analysis on the genes with different patterns of SNVs by pain sensitivity status, we observed a significant enrichment of variants in genes of the angiotensin pathway (Bonferroni corrected p = 3.8 × 10^-4). This pathway is already implicated in animal models and human studies of pain, supporting the notion that it may provide fruitful new targets in pain management. The approach of sequencing extreme exome variation in normal individuals has provided important insights into gene networks mediating pain sensitivity in humans and will be applicable to other common complex traits.

3.
Mutation position imaging toolbox (MuPIT) interactive is a browser-based application for single-nucleotide variants (SNVs), which automatically maps the genomic coordinates of SNVs onto the coordinates of available three-dimensional (3D) protein structures. The application is designed for interactive browser-based visualization of the putative functional relevance of SNVs by biologists who are not necessarily experts either in bioinformatics or protein structure. Users may submit batches of several thousand SNVs and review all protein structures that cover the SNVs, including available functional annotations such as binding sites, mutagenesis experiments, and common polymorphisms. Multiple SNVs may be mapped onto each structure, enabling 3D visualization of SNV clusters and their relationship to functionally annotated positions. We illustrate the utility of MuPIT interactive in rationalizing the impact of selected polymorphisms in the PharmGKB database, somatic mutations identified in the Cancer Genome Atlas study of invasive breast carcinomas, and rare variants identified in the exome sequencing project. MuPIT interactive is freely available for non-profit use at http://mupit.icm.jhu.edu.

4.
《PloS one》2014,9(8)
Asthma is a complex genetic disease caused by a combination of genetic and environmental risk factors. We sought to test classes of genetic variants largely missed by genome-wide association studies (GWAS), including copy number variants (CNVs) and low-frequency variants, by performing whole-genome sequencing (WGS) on 16 individuals from asthma-enriched and asthma-depleted families. The samples were obtained from an extended 13-generation Hutterite pedigree with reduced genetic heterogeneity due to a small founding gene pool and reduced environmental heterogeneity as a result of a communal lifestyle. We sequenced each individual to an average depth of 13-fold, generated a comprehensive catalog of genetic variants, and tested the most severe mutations for association with asthma. We identified and validated 1960 CNVs, 19 nonsense or splice-site single nucleotide variants (SNVs), and 18 insertions or deletions that were out of frame. As follow-up, we performed targeted sequencing of 16 genes in 837 cases and 540 controls of Puerto Rican ancestry and found that controls carry a significantly higher burden of mutations in IL27RA (2.0% of controls; 0.23% of cases; nominal p = 0.004; Bonferroni p = 0.21). We also genotyped 593 CNVs in 1199 Hutterite individuals. We identified a nominally significant association (p = 0.03; Odds ratio (OR) = 3.13) between a 6 kbp deletion in an intron of NEDD4L and increased risk of asthma. We genotyped this deletion in an additional 4787 non-Hutterite individuals (nominal p = 0.056; OR = 1.69). NEDD4L is expressed in bronchial epithelial cells, and conditional knockout of this gene in the lung in mice leads to severe inflammation and mucus accumulation. Our study represents one of the early instances of applying WGS to complex disease with a large environmental component and demonstrates how WGS can identify risk variants, including CNVs and low-frequency variants, largely untested in GWAS.

5.
The use of post-alignment procedures has been suggested to prevent the identification of false positives in massive DNA sequencing data. Insertions and deletions are most likely to be misinterpreted by variant calling algorithms. Using known genetic variants as references for post-processing pipelines can minimize mismatches. They allow reads to be correctly realigned and recalibrated, resulting in more parsimonious variant calling. In this work, we aim to investigate the impact of using different sets of common variants as references to facilitate variant calling from whole-exome sequencing data. We selected reference variants from common insertions and deletions available within the 1K Genomes project data and from the Latin American Database of Genetic Variation (LatinGen). We used the Genome Analysis Toolkit to perform post-processing procedures such as local realignment and quality recalibration, followed by variant calling, in whole-exome samples. We identified an increased number of variants in the call set for all groups when no post-processing procedure was performed. We found that there was a higher concordance rate between variants called using 1K Genomes and LatinGen. Therefore, we believe that the increased number of rare variants identified in the analysis without realignment or quality recalibration indicates that they were likely false positives.
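The abstract does not say how the concordance rate between call sets was computed. As a minimal sketch, assuming each variant is keyed by (chromosome, position, reference allele, alternate allele) and that concordance is taken as shared calls divided by all calls, a comparison might look like the following. The function name and the example variants are hypothetical.

```python
def concordance(calls_a, calls_b):
    """Fraction of variants shared by two call sets (intersection over union).

    This is one possible definition of concordance; the abstract does not
    specify which one the authors used.
    """
    a, b = set(calls_a), set(calls_b)
    union = a | b
    if not union:
        return 1.0  # two empty call sets agree trivially
    return len(a & b) / len(union)


# Hypothetical call sets: (chromosome, position, ref allele, alt allele).
calls_ref_set_1 = [("1", 100, "A", "G"), ("1", 200, "C", "CT")]
calls_ref_set_2 = [("1", 100, "A", "G"), ("2", 50, "G", "A")]

rate = concordance(calls_ref_set_1, calls_ref_set_2)
print(f"{rate:.3f}")
```

Keying on all four fields means an insertion and a substitution at the same position are counted as different calls, which matters for the indel-heavy discrepancies the abstract describes.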

6.
A dozen genes/regions have been confirmed as genetic risk factors for oral clefts in human association and linkage studies, and animal models argue even more genes may be involved. Genomic sequencing studies should identify specific causal variants and may reveal additional genes as influencing risk of oral clefts, which have a complex and heterogeneous etiology. We conducted a whole exome sequencing (WES) study to search for potentially causal variants using affected relatives drawn from multiplex cleft families. Two or three affected second, third, and higher degree relatives from 55 multiplex families were sequenced. We examined rare single nucleotide variants (SNVs) shared by affected relatives in 348 recognized candidate genes. Exact probabilities that affected relatives would share these rare variants were calculated, given pedigree structures, and corrected for the number of variants tested. Five novel and potentially damaging SNVs shared by affected distant relatives were found and confirmed by Sanger sequencing. One damaging SNV in CDH1, shared by three affected second cousins from a single family, attained statistical significance (P = 0.02 after correcting for multiple tests). Family-based designs such as the one used in this WES study offer important advantages for identifying genes likely to be causing complex and heterogeneous disorders.

7.
Autism spectrum disorders (ASD) are a heterogeneous group of neurodevelopmental disorders with a complex inheritance pattern. While many rare variants in synaptic proteins have been identified in patients with ASD, little is known about their effects at the synapse and their interactions with other genetic variations. Here, following the discovery of two de novo SHANK2 deletions by the Autism Genome Project, we identified a novel 421 kb de novo SHANK2 deletion in a patient with autism. We then sequenced SHANK2 in 455 patients with ASD and 431 controls and integrated these results with those reported by Berkel et al. 2010 (n = 396 patients and n = 659 controls). We observed a significant enrichment of variants affecting conserved amino acids in 29 of 851 (3.4%) patients and in 16 of 1,090 (1.5%) controls (P = 0.004, OR = 2.37, 95% CI = 1.23–4.70). In neuronal cell cultures, the variants identified in patients were associated with a reduced synaptic density at dendrites compared to the variants only detected in controls (P = 0.0013). Interestingly, the three patients with de novo SHANK2 deletions also carried inherited CNVs at 15q11–q13 previously associated with neuropsychiatric disorders. In two cases, the nicotinic receptor CHRNA7 was duplicated and in one case the synaptic translation repressor CYFIP1 was deleted. These results strengthen the role of synaptic gene dysfunction in ASD but also highlight the presence of putative modifier genes, which is in keeping with the “multiple hit model” for ASD. A better knowledge of these genetic interactions will be necessary to understand the complex inheritance pattern of ASD.
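The reported odds ratio can be recomputed from the carrier counts given in the abstract (29 of 851 patients, 16 of 1,090 controls). The sketch below is only an arithmetic check; the function name is hypothetical, and the confidence interval and P value are not reproduced because the abstract does not state how they were derived.

```python
def odds_ratio(case_carriers, case_total, control_carriers, control_total):
    """Odds ratio from a 2x2 table of carriers versus non-carriers."""
    case_non_carriers = case_total - case_carriers
    control_non_carriers = control_total - control_carriers
    return (case_carriers * control_non_carriers) / (
        control_carriers * case_non_carriers
    )


# Counts taken from the abstract: 29/851 patients, 16/1,090 controls.
shank2_or = odds_ratio(29, 851, 16, 1090)
print(round(shank2_or, 2))  # 2.37, matching the reported OR
```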

8.
Although rare variants within the Toll-like receptor signalling pathway genes have been found to underlie human primary immunodeficiencies associated with selective predisposition to invasive pneumococcal disease (IPD), the contribution of variants in these genes to IPD susceptibility at the population level remains unknown. Complete re-sequencing of IRAK4, MYD88 and IKBKG genes was undertaken in 164 IPD cases from the UK and 164 geographically-matched population-based controls. 233 single-nucleotide variants (SNVs) were identified, of which ten were in coding regions. Four rare coding variants were predicted to be deleterious, two variants in MYD88 and two in IRAK4. The predicted deleterious variants in MYD88 were observed as two heterozygote cases but not seen in controls. Frequencies of predicted deleterious IRAK4 SNVs were the same in cases and controls. Our findings suggest that rare, functional variants in MYD88, IRAK4 or IKBKG do not significantly contribute to IPD susceptibility in adults at the population level.

9.
Ku CS, Naidoo N, Pawitan Y. Human Genetics, 2011, 129(4):351-370
Over the past several years, more focus has been placed on dissecting the genetic basis of complex diseases and traits through genome-wide association studies. In contrast, Mendelian disorders have received little attention, mainly due to the lack of newer and more powerful methods to study these disorders. Linkage studies have previously been the main tool to elucidate the genetics of Mendelian disorders; however, extremely rare disorders or sporadic cases caused by de novo variants are not amenable to this study design. Exome sequencing has now become technically feasible and more cost-effective due to recent advances in high-throughput sequence capture methods and next-generation sequencing technologies, which have offered new opportunities for Mendelian disorder research. Exome sequencing has been swiftly applied to the discovery of new causal variants and candidate genes for a number of Mendelian disorders such as Kabuki syndrome, Miller syndrome and Fowler syndrome. In addition, de novo variants were also identified for sporadic cases, which would not have been possible without exome sequencing. Although exome sequencing has proven to be a promising approach to study Mendelian disorders, several shortcomings of this method must be noted, such as the inability to capture regulatory or evolutionarily conserved sequences in non-coding regions and the incomplete capture of all exons.

10.
Both schizophrenia (SCZ) and autism spectrum disorders (ASD) are neuropsychiatric disorders with overlapping genetic etiology. Protocadherin 15 (PCDH15), which encodes a member of the cadherin superfamily that contributes to neural development and function, has been cited as a risk gene for neuropsychiatric disorders. Recently, rare variants of large effect have received attention as a means of understanding the etiopathology of these complex disorders. Thus, we evaluated the impact of rare single-nucleotide variants (SNVs) in PCDH15 on SCZ or ASD. First, we conducted coding exon-targeted resequencing of PCDH15 with next-generation sequencing technology in 562 Japanese patients (370 SCZ and 192 ASD) and detected 16 heterozygous SNVs. We then performed association analyses on 2,096 cases (1,714 SCZ and 382 ASD) and 1,917 controls with six novel variants of these 16 SNVs. Of these six variants, four (p.R219K, p.T281A, p.D642N, c.3010-1G>C) were ultra-rare variants (minor allele frequency < 0.0005) that may increase disease susceptibility. Ultimately, no statistically significant association between any of these rare, heterozygous PCDH15 point variants and SCZ or ASD was found. Our results suggest that a larger sample size of resequencing subjects is necessary to detect associations between rare PCDH15 variants and neuropsychiatric disorders.

11.
12.
The advent of next-generation sequencing has facilitated large-scale discovery, validation and assessment of genetic markers for high density genotyping. The present study was undertaken to identify markers in genes supposedly related to wood property traits in three Eucalyptus species. Ninety-four genes involved in xylogenesis were selected for hybridization probe-based nuclear genomic DNA target enrichment and exome sequencing. Genomic DNA was isolated from the leaf tissues and used for on-array probe hybridization followed by Illumina sequencing. The raw sequence reads were trimmed, high-quality reads were mapped to the E. grandis reference sequence, and single nucleotide variants (SNVs) and insertions/deletions (InDels) were identified across the three species. The average read coverage was 216X and a total of 2294 SNVs and 479 InDels were discovered in E. camaldulensis, 2383 SNVs and 518 InDels in E. tereticornis, and 1228 SNVs and 409 InDels in E. grandis. Additionally, SNV calling and InDel detection were conducted in pair-wise comparisons of E. tereticornis vs. E. grandis, E. camaldulensis vs. E. tereticornis and E. camaldulensis vs. E. grandis. This study presents an efficient and high-throughput method for developing genetic markers for family-based QTL and association analysis in Eucalyptus.

13.
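Using single-nucleotide polymorphisms (SNPs) as genetic markers, genome-wide association studies (GWAS) have identified more than 3,800 genetic susceptibility regions for more than 660 diseases (or traits). However, the most significantly associated or causal variants within these regions, and their biological functions, are not fully understood. Identifying these variants will help elucidate the biological mechanisms of complex diseases and reveal new disease markers. One of the main tasks of the post-GWAS era is therefore to use fine-mapping studies to find the most significantly associated or causal susceptibility variants within the susceptibility regions of complex diseases and to clarify their biological functions. For common variants, SNP density can be increased through imputation or resequencing to find the most significantly associated SNPs, and functional SNPs and susceptibility genes can be sought through functional element analysis, expression quantitative trait locus (eQTL) analysis, haplotype analysis and related methods. For rare variants, fine mapping can be carried out using resequencing, rare haplotype analysis, family-based analysis and burden tests. This article reviews these strategies and the problems they face.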

14.
As next-generation sequencing projects generate massive genome-wide sequence variation data, bioinformatics tools are being developed to provide computational predictions on the functional effects of sequence variations and narrow down the search for causal variants for disease phenotypes. Different classes of sequence variations at the nucleotide level are involved in human diseases, including substitutions, insertions, deletions, frameshifts, and non-sense mutations. Frameshifts and non-sense mutations are likely to cause a negative effect on protein function. Existing prediction tools primarily focus on studying the deleterious effects of single amino acid substitutions through examining amino acid conservation at the position of interest among related sequences, an approach that is not directly applicable to insertions or deletions. Here, we introduce a versatile alignment-based score as a new metric to predict the damaging effects of variations not limited to single amino acid substitutions but also in-frame insertions, deletions, and multiple amino acid substitutions. This alignment-based score measures the change in sequence similarity of a query sequence to a protein sequence homolog before and after the introduction of an amino acid variation to the query sequence. Our results showed that the scoring scheme performs well in separating disease-associated variants (n = 21,662) from common polymorphisms (n = 37,022) for UniProt human protein variations, and also in separating deleterious variants (n = 15,179) from neutral variants (n = 17,891) for UniProt non-human protein variations. In our approach, the area under the receiver operating characteristic curve (AUC) for the human and non-human protein variation datasets is ∼0.85. We also observed that the alignment-based score correlates with the deleteriousness of a sequence variation.
In summary, we have developed a new algorithm, PROVEAN (Protein Variation Effect Analyzer), which provides a generalized approach to predict the functional effects of protein sequence variations including single or multiple amino acid substitutions, and in-frame insertions and deletions. The PROVEAN tool is available online at http://provean.jcvi.org.
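The idea of scoring a variant by the change in similarity to homologs before and after the variation is introduced can be illustrated with a toy example. This is not the PROVEAN implementation: it assumes pre-aligned sequences of equal length and a simple match/mismatch score in place of a real alignment and substitution matrix, so it handles substitutions only. All function names and sequences are hypothetical.

```python
def similarity(seq_a, seq_b, match=1, mismatch=-1):
    """Toy similarity: per-position match/mismatch sum for two sequences
    that are assumed to be pre-aligned and of equal length."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    return sum(match if x == y else mismatch for x, y in zip(seq_a, seq_b))


def delta_score(reference, variant, homologs):
    """Mean change in similarity to each homolog caused by the variant.

    Negative values mean the variant makes the query less similar to its
    homologs, which this sketch treats as a sign of a damaging change.
    """
    deltas = [similarity(variant, h) - similarity(reference, h) for h in homologs]
    return sum(deltas) / len(deltas)


# Hypothetical sequences; position 5 (Y) is conserved in every homolog.
homologs = ["MKTAYIAK", "MKTAYLAK", "MKSAYIAK"]
reference = "MKTAYIAK"
variant = "MKTAWIAK"  # Y-to-W substitution at the conserved position

score = delta_score(reference, variant, homologs)
print(score)  # -2.0
```

Supporting the in-frame insertions and deletions described in the abstract would require re-aligning the variant sequence to each homolog, which this equal-length simplification deliberately leaves out.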

15.
16.
Recent advances in DNA sequencing techniques have identified rare single-nucleotide variants with less than 1% minor allele frequency. Despite the growing interest and physiological importance of rare variants in genome sciences, less attention has been paid to the allele frequency of variants in protein sciences. To elucidate the characteristics of genetic variants on protein interaction sites, from the viewpoints of the allele frequency and the structural position of variants, we mapped about 20,000 human SNVs onto protein complexes. We found that variants are less abundant in protein interfaces, and specifically the core regions of interfaces. The tendency to “avoid” the interfacial core is stronger among common variants than rare variants. As amino acid substitutions, the trend of mutating amino acids among rare variants is consistent in different interfacial regions, reflecting the fact that rare variants result from random mutations in DNA sequences, whereas amino acid changes of common variants vary between the interfacial core and rim regions, possibly due to functional constraints on proteins. This study illustrates how the allele frequency of variants relates to the protein structural regions and the functional sites in general and will lead to a deeper understanding of the potential deleteriousness of rare variants at the structural level. Exceptions to the observed trends will shed light on the limitations of structural approaches to evaluating the functional impacts of variants.

17.
Hypercholesterolemia has strong heritability and about 40–60% of hypercholesterolemia is caused by genetic risk factors. A number of monogenic genes have been identified so far for familial hypercholesterolemia (FH). However, in the general population, more than 90% of individuals with LDL cholesterol over 190 mg/dL do not carry known FH mutations. Large-scale whole-exome sequencing has identified thousands of variants that are predicted to be loss-of-function (LoF), and each individual has a median of about twenty rare LoF variants and several hundred more common LoF variants. However, the majority of those variants have not been characterized and their functional consequences remain largely unknown. Rs77542162 is a common missense variant in ABCA6 and is strongly associated with hypercholesterolemia in different populations. ABCA6 is a cholesterol-responsive gene and has been suggested to play a role in lipid metabolism. However, whether and how rs77542162 and ABCA6 regulate lipoprotein metabolism remain unknown. In the current study, we systematically characterized the function of rs77542162 and ABCA6 in cultured cells and in rodents in vivo. We found that Abca6 is specifically expressed on the basolateral surface of hepatocytes in mouse liver. The rs77542162 variant disrupts ABCA6 protein stability and results in loss of functional protein. However, we found no evidence that Abca6 plays a role in lipoprotein metabolism in either normal mice or hypercholesterolemic mice or hamsters. Thus, our results suggest that Abca6 does not regulate lipoprotein metabolism in rodents and highlight the challenge and importance of functional characterization of disease-associated variants in animal models.

18.
There is much interest in characterizing the variation in a human individual, because this may elucidate what contributes significantly to a person's phenotype, thereby enabling personalized genomics. We focus here on the variants in a person's 'exome,' which is the set of exons in a genome, because the exome is believed to harbor much of the functional variation. We provide an analysis of the approximately 12,500 variants that affect the protein coding portion of an individual's genome. We identified approximately 10,400 nonsynonymous single nucleotide polymorphisms (nsSNPs) in this individual, of which approximately 15-20% are rare in the human population. We predict approximately 1,500 nsSNPs affect protein function and these tend to be heterozygous, rare, or novel. Of the approximately 700 coding indels, approximately half tend to have lengths that are a multiple of three, which causes insertions/deletions of amino acids in the corresponding protein, rather than introducing frameshifts. Coding indels also occur frequently at the termini of genes, so even if an indel causes a frameshift, an alternative start or stop site in the gene can still be used to make a functional protein. In summary, we reduced the set of approximately 12,500 nonsilent coding variants by approximately 8-fold to a set of variants that are most likely to have major effects on their proteins' functions. This is our first glimpse of an individual's exome and a snapshot of the current state of personalized genomics. The majority of coding variants in this individual are common and appear to be functionally neutral. Our results also indicate that some variants can be used to improve the current NCBI human reference genome. As more genomes are sequenced, many rare variants and non-SNP variants will be discovered. We present an approach to analyze the coding variation in humans by proposing multiple bioinformatic methods to home in on possible functional variation.
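The distinction drawn above between indels whose length is a multiple of three and those that shift the reading frame reduces to a modulo check. The sketch below assumes the indel lies entirely within coding sequence and ignores splice sites and the alternative start or stop sites mentioned in the abstract. The function name and alleles are hypothetical.

```python
def indel_effect(ref_allele, alt_allele):
    """Classify a coding indel by its effect on the reading frame."""
    length_change = abs(len(alt_allele) - len(ref_allele))
    if length_change == 0:
        return "not an indel"
    # A net change of whole codons adds or removes amino acids but keeps
    # the downstream reading frame intact.
    return "in-frame" if length_change % 3 == 0 else "frameshift"


print(indel_effect("A", "ACGT"))  # 3-base insertion: in-frame
print(indel_effect("ACG", "A"))   # 2-base deletion: frameshift
```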

19.
Calpainopathy: a survey of mutations and polymorphisms.
Limb-girdle muscular dystrophy type 2A (LGMD2A) is an autosomal recessive disorder characterized mainly by symmetrical and selective atrophy of the proximal limb muscles. It derives from defects in the human CAPN3 gene, which encodes the skeletal muscle-specific member of the calpain family. This report represents a compilation of the mutations and variants identified so far in this gene. To date, 97 distinct pathogenic calpain 3 mutations have been identified (4 nonsense mutations, 32 deletions/insertions, 8 splice-site mutations, and 53 missense mutations), 56 of which have not been described previously, together with 12 polymorphisms and 5 nonclassified variants. The mutations are distributed along the entire length of the CAPN3 gene. Thus far, most mutations identified represent private variants, although particular mutations have been found more frequently. Knowledge of the mutation spectrum occurring in the CAPN3 gene may contribute significantly to structure/function and pathogenesis studies. It may also help in the design of efficient mutation-screening strategies for calpainopathies.

20.
SLC6A15 is a neuron-specific neutral amino acid transporter that belongs to the solute carrier 6 gene family. This gene family is responsible for presynaptic re-uptake of the majority of neurotransmitters. Convergent data from human studies, animal models and pharmacological investigations suggest a possible role of SLC6A15 in major depressive disorder. In this work, we explored potential functional variants in this gene that could influence the activity of the amino acid transporter and thus downstream neuronal function and possibly the risk for stress-related psychiatric disorders. DNA from 400 depressed patients and 400 controls was screened for genetic variants using a pooled targeted re-sequencing approach. Results were verified by individual re-genotyping, and validated non-synonymous coding variants were tested in an independent sample (N = 1934). Nine variants altering the amino acid sequence were then assessed for their functional effects by measuring SLC6A15 transporter activity in a cellular uptake assay. In total, we identified 405 genetic variants, including twelve non-synonymous variants. While none of the non-synonymous coding variants showed significant differences in case-control associations, two rare non-synonymous variants were associated with a significantly increased maximal 3H proline uptake as compared to the wildtype sequence. Our data suggest that genetic variants in the SLC6A15 locus change the activity of the amino acid transporter and might thus influence its neuronal function and the risk for stress-related psychiatric disorders. As statistically significant association for rare variants might only be achieved in extremely large samples (N > 70,000), functional exploration may shed light on putatively disease-relevant variants.
