Similar articles
20 similar articles found (search time: 31 ms)
1.
Ren F, Tanaka H, Yang Z. Gene 2009, 441(1-2):119-125
Supermatrix and supertree methods are two strategies advocated for phylogenetic analysis of sequence data from multiple gene loci, especially when some species are missing at some loci. The supermatrix method concatenates sequences from multiple genes into a data supermatrix for phylogenetic analysis, and ignores differences in evolutionary dynamics among the genes. The supertree method analyzes each gene separately and assembles the subtrees estimated from individual genes into a supertree for all species. Most algorithms suggested for supertree construction lack statistical justification and ignore uncertainties in the subtrees. Instead of supermatrix or supertree, we advocate the use of the likelihood function to combine data from multiple genes while accommodating their differences in the evolutionary process. This combines the strengths of the supermatrix and supertree methods while avoiding their drawbacks. We conduct computer simulations to evaluate the performance of the supermatrix, supertree, and maximum likelihood methods applied to two phylogenetic problems: molecular-clock dating of species divergences and reconstruction of species phylogenies. The results confirm the theoretical superiority of the likelihood method. Supertree or separate analyses of data from multiple genes may be useful in revealing the characteristics of the evolutionary process of multiple gene loci, and the information may be used to formulate realistic models for combined analysis of all genes by likelihood.

2.
Genome-wide association studies (GWAS) are now used routinely to identify SNPs associated with complex human phenotypes. In several cases, multiple variants within a gene contribute independently to disease risk. Here we introduce a novel Gene-Wide Significance (GWiS) test that uses greedy Bayesian model selection to identify the independent effects within a gene, which are combined to generate a stronger statistical signal. Permutation tests provide p-values that correct for the number of independent tests genome-wide and within each genetic locus. When applied to a dataset comprising 2.5 million SNPs in up to 8,000 individuals measured for various electrocardiography (ECG) parameters, this method identifies more validated associations than conventional GWAS approaches. The method also provides, for the first time, systematic assessments of the number of independent effects within a gene and the fraction of disease-associated genes housing multiple independent effects, observed at 35%-50% of loci in our study. This method can be generalized to other study designs, retains power for low-frequency alleles, and provides gene-based p-values that are directly compatible with pathway-based meta-analysis.
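
The permutation step mentioned in this abstract can be illustrated with a minimal sketch. It is not the GWiS method itself (there is no Bayesian model selection here); it assumes a simpler, generic statistic, the best single-SNP squared correlation within a gene, and the function names and toy data are hypothetical.

```python
import random


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy * sxy / (sxx * syy)


def gene_permutation_pvalue(snps, phenotype, n_perm=999, seed=0):
    """Permutation p-value for the best single-SNP r^2 within one gene.

    snps: list of per-SNP genotype lists (0/1/2 allele counts).
    Shuffling the phenotype keeps the SNP-SNP correlation intact, so the
    p-value accounts for testing several correlated SNPs in the gene.
    """
    rng = random.Random(seed)
    observed = max(r_squared(g, phenotype) for g in snps)
    shuffled = list(phenotype)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if max(r_squared(g, shuffled) for g in snps) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Toy data (hypothetical): SNP 1 tracks the phenotype, SNP 2 does not.
snps = [
    [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2],
    [0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2],
]
phenotype = [1.0, 1.2, 0.9, 1.1, 2.1, 1.9, 2.2, 2.0, 3.1, 2.9, 3.0, 3.2]
p_gene = gene_permutation_pvalue(snps, phenotype)
```

Because the whole phenotype vector is shuffled at once, the resulting gene-level p-value needs no separate correction for the number of SNPs tested inside the gene.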

3.
Unraveling the genetic background of economic traits is a major goal in modern animal genetics and breeding. Both candidate gene analysis and QTL mapping have previously been used for identifying genes and chromosome regions related to studied traits. However, most of these studies may be limited in their ability to fully consider how multiple genetic factors may influence a particular phenotype of interest. If possible, taking advantage of the combined effect of multiple genetic factors is expected to be more powerful than analyzing single sites, as the joint action of multiple loci within a gene or across multiple genes acting in the same gene set will likely have a greater influence on phenotypic variation. Thus, we proposed a pipeline of gene set analysis that utilizes information from multiple loci to improve statistical power. We assessed the performance of this approach using both simulated data and a real IGF1-FoxO pathway data set. The results showed that our new method can identify the association between genetic variation and phenotypic variation with higher statistical power and unravel the mechanisms of complex traits from the perspective of gene sets. Additionally, the proposed pipeline is flexible and can be extended to model complex genetic structures that include the interactions between different gene sets and between gene sets and environments.

4.
The standard approach for identifying gene networks is based on experimental perturbations of gene regulatory systems such as gene knock-out experiments, followed by genome-wide profiling of differential gene expression. However, this approach is significantly limited in that it is not possible to perturb more than one or two genes simultaneously to discover complex gene interactions or to distinguish between direct and indirect downstream regulation of the differentially expressed genes. As an alternative, genetical genomics studies have been proposed to treat naturally occurring genetic variants as potential perturbants of the gene regulatory system and to recover gene networks via analysis of population gene-expression and genotype data. Despite the many advantages of genetical genomics data analysis, the computational challenge of decoding the effects of multifactorial genetic perturbations simultaneously from data has prevented widespread application of genetical genomics analysis. In this article, we propose a statistical framework for learning gene networks that overcomes the limitations of experimental perturbation methods and addresses the challenges of genetical genomics analysis. We introduce a new statistical model, called a sparse conditional Gaussian graphical model, and describe an efficient learning algorithm that simultaneously decodes the perturbations of the gene regulatory system by a large number of SNPs to identify a gene network along with the expression quantitative trait loci (eQTLs) that perturb this network. While our statistical model captures direct genetic perturbations of the gene network, by performing inference on the probabilistic graphical model we obtain detailed characterizations of how the direct SNP perturbation effects propagate through the gene network to perturb other genes indirectly. We demonstrate our statistical method using HapMap-simulated and yeast eQTL datasets. In particular, the yeast gene network identified computationally by our method under SNP perturbations is well supported by the results from experimental perturbation studies related to the DNA replication stress response.

5.
6.
Abo R, Jenkins GD, Wang L, Fridley BL. PLoS ONE 2012, 7(8):e43301
Genetic variation underlying the regulation of mRNA gene expression in humans may provide key insights into the molecular mechanisms of human traits and complex diseases. Current statistical methods to map genetic variation associated with mRNA gene expression have typically applied standard linkage and/or association methods; however, when genome-wide SNP and mRNA expression data are available, performing all pairwise comparisons is computationally burdensome and may not provide optimal power to detect associations. Consideration of different approaches to account for the high dimensionality and multiple testing issues may provide increased efficiency and statistical power. Here we present a novel approach to model and test the association between genetic variation and mRNA gene expression levels in the context of gene sets (GSs) and pathways, referred to as gene set - expression quantitative trait loci analysis (GS-eQTL). The method uses GSs to initially group SNPs and mRNA expression, followed by the application of principal components analysis (PCA) to collapse the variation and reduce the dimensionality within the GSs. We applied GS-eQTL to assess the association between SNP and mRNA expression level data collected from a cell-based model system using PharmGKB- and KEGG-defined GSs. We observed a large number of significant GS-eQTL associations, in which the most significant associations arose between genetic variation and mRNA expression from the same GS. However, a number of associations involving genetic variation and mRNA expression from different GSs were also identified. Our proposed GS-eQTL method effectively addresses the multiple testing limitations in eQTL studies and provides biological context for SNP-expression associations.

7.
Li C, Zhang G, Li X, Rao S, Gong B, Jiang W, Hao D, Wu P, Wu C, Du L, Xiao Y, Wang Y. Gene 2008, 408(1-2):104-111
The advent of high-throughput single nucleotide polymorphism (SNP) omics technologies has produced a tremendous amount of genetic data. Systematic evaluation of genome-wide SNPs is expected to provide breakthroughs in the understanding of complex diseases. In this study, we developed a new systematic method for mapping multiple loci and applied the proposed method to construct a genetic network for rheumatoid arthritis (RA) via analysis of 746 multiplex families genotyped with more than five thousand genome-wide SNPs. We successfully identified 41 significant SNPs relevant to RA, 25 associated genes and a number of important SNP-SNP interactions (SNP patterns). Many findings (loci, genes and interactions) have experimental support from previous studies, while novel findings may define unknown genetic pathways for this complex disease. Finally, we constructed a genetic network by integrating the results from this analysis with the rapidly accumulating knowledge in biomedical domains, which gave us a more detailed insight into RA etiology. The results suggest that the proposed systematic method is powerful when applied to genome-wide association studies. Integrating the analysis of high-throughput SNP data with knowledge-based SNP functional annotation offers a promising way to reverse-engineer the underlying genetic networks for complex human diseases.

8.
Epistasis, or gene–gene interaction, results from joint effects of genes on a trait; thus, the same alleles of one gene may display different genetic effects in different genetic backgrounds. In this study, we generalized the coding technique of a natural and orthogonal interaction (NOIA) model for association studies along with gene–gene interactions for dichotomous traits and human complex diseases. The NOIA model, which has uncorrelated estimators for genetic effects, is important for estimating the influence of multiple loci. We conducted simulations and data analyses to evaluate the performance of the NOIA model. Both simulation and real data analyses revealed that the NOIA statistical model had higher power for detecting main genetic effects and usually had higher power for some interaction effects than the usual model. Although genes predisposing people to melanoma have been identified (HERC2 at 15q13.1, MC1R at 16q24.3 and CDKN2A at 9p21.3), gene–gene interactions have not been fully explored for melanoma. By applying the NOIA statistical model to a genome-wide melanoma dataset, we confirmed the previously identified significantly associated genes and found potential regions at chromosomes 5 and 4 that may interact with the HERC2 and MC1R genes, respectively. Our study not only generalized the orthogonal NOIA model but also provided useful insights for understanding the influence of interactions on melanoma risk.
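
What "orthogonal" coding means for a single locus can be shown with a small sketch. The abstract does not state the coding, so the scalars below are an assumption: they follow the form commonly given for the "statistical" NOIA formulation at one biallelic locus and should be checked against the original NOIA papers. The function names and genotype frequencies are hypothetical.

```python
def noia_scalars(p11, p12, p22):
    """Additive and dominance scalars for genotypes 11, 12 and 22.

    p11, p12, p22 are genotype frequencies that sum to 1
    (Hardy-Weinberg proportions are not required).
    """
    mean_count = p12 + 2.0 * p22  # mean count of allele 2
    additive = [0 - mean_count, 1 - mean_count, 2 - mean_count]
    denom = p11 + p22 - (p11 - p22) ** 2
    dominance = [
        -2.0 * p12 * p22 / denom,
        4.0 * p11 * p22 / denom,
        -2.0 * p11 * p12 / denom,
    ]
    return additive, dominance


def weighted_dot(freqs, u, w):
    """Frequency-weighted inner product of two genotype codings."""
    return sum(f * a * b for f, a, b in zip(freqs, u, w))


freqs = (0.5, 0.3, 0.2)  # hypothetical genotype frequencies
additive, dominance = noia_scalars(*freqs)
ones = [1.0, 1.0, 1.0]

# All three should be zero up to rounding: both codings are centred,
# and the additive and dominance codings are mutually uncorrelated.
mean_additive = weighted_dot(freqs, additive, ones)
mean_dominance = weighted_dot(freqs, dominance, ones)
cross_term = weighted_dot(freqs, additive, dominance)
```

Under this coding the additive and dominance columns of the design matrix are uncorrelated in the sampled population, which is the property the abstract refers to as "uncorrelated estimators".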

9.
Genome-wide polymorphisms show unexpected targets of natural selection (total citations: 1; self-citations: 0; citations by others: 1)
Natural selection can act on all the expressed genes of an individual, leaving signatures of genetic differentiation or diversity at many loci across the genome. New power to assay these genome-wide effects of selection comes from associating multi-locus patterns of polymorphism with gene expression and function. Here, we performed one of the first genome-wide surveys in a marine species, comparing purple sea urchins, Strongylocentrotus purpuratus, from two distant locations along the species' wide latitudinal range. We examined 9112 polymorphic loci from upstream non-coding and coding regions of genes for signatures of selection with respect to gene function and tissue- and ontogenetic gene expression. We found that genetic differentiation (F(ST)) varied significantly across functional gene classes. The strongest enrichment occurred in the upstream regions of E3 ligase genes, enzymes known to regulate protein abundance during development and environmental stress. We found enrichment for high heterozygosity in genes directly involved in immune response, particularly NALP genes, which mediate pro-inflammatory signals during bacterial infection. We also found higher heterozygosity in immune genes in the southern population, where disease incidence and pathogen diversity are greater. Similar to the major histocompatibility complex in mammals, balancing selection may enhance genetic diversity in the innate immune system genes of this invertebrate. Overall, our results show how genome-wide polymorphism data coupled with growing databases on gene function and expression can combine to detect otherwise hidden signals of selection in natural populations.
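
The two summary statistics named in this abstract, heterozygosity and F(ST), can be sketched for a single biallelic locus sampled in two populations. This is a textbook heterozygosity-based estimate, F_ST = (H_T - H_S) / H_T, with equal population weights and no sample-size correction; the estimator actually used in the study is not given in the abstract, and the allele frequencies below are hypothetical.

```python
def heterozygosity(p):
    """Expected heterozygosity at a biallelic locus with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def fst_two_populations(p1, p2):
    """F_ST from allele frequencies p1 and p2 in two equally weighted populations."""
    h_s = (heterozygosity(p1) + heterozygosity(p2)) / 2.0  # mean within-population
    h_t = heterozygosity((p1 + p2) / 2.0)                  # pooled population
    if h_t == 0:
        return 0.0  # locus is fixed for the same allele everywhere
    return (h_t - h_s) / h_t


fst_divergent = fst_two_populations(0.2, 0.8)  # strongly differentiated locus
fst_identical = fst_two_populations(0.3, 0.3)  # no differentiation
```

With frequencies 0.2 and 0.8 the within-population heterozygosity is 0.32 and the pooled heterozygosity is 0.5, giving F_ST = 0.36; identical frequencies give 0.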

10.
Marinov M, Weeks DE. Human Heredity 2001, 51(3):169-176
As the focus of genome-wide scans for disease loci has shifted from simple Mendelian traits to genetically complex traits, researchers have begun to consider alternative ways to detect linkage that consider more than the marginal effects of a single disease locus at a time. One interesting new method is to train a neural network on a genome-wide data set in order to search for the best non-linear relationship between identity-by-descent sharing among affected siblings at markers and their disease status. We investigate here the repeatability of the neural network results from run to run, and show that the results obtained by multiple runs of the neural network method may differ considerably. This is most likely because training a neural network involves minimizing an error function with a multitude of local minima.

11.
12.
Anorectal malformations (ARMs) are birth defects that require surgery and carry significant chronic morbidity. Our earlier genome-wide copy number variation (CNV) study had provided a wealth of candidate loci. To find out whether these candidate loci are related to important developmental pathways, we have performed an extensive literature search coupled with the currently available bioinformatics tools. This has allowed us to assign both genic and non-genic CNVs to interrelated pathways known to govern the development of the anorectal region. We have linked 11 candidate genes to the WNT signalling pathway and 17 genes to the cytoskeletal network. Interestingly, candidate genes with similar functions are disrupted by the same type of CNV. The gene network we discovered provides evidence that rare mutations in different interrelated genes may lead to similar phenotypes, accounting for genetic heterogeneity in ARMs. Classification of patients according to the affected pathway and lesion type should eventually improve the diagnosis and the identification of common genes/molecules as therapeutic targets.

13.
The hippocampus is critical for a wide range of emotional and cognitive behaviors. Here, we performed the first genome-wide search for genes influencing hippocampal oscillations. We measured local field potentials (LFPs) using 64-channel multi-electrode arrays in acute hippocampal slices of 29 BXD recombinant inbred mouse strains. Spontaneous activity and carbachol-induced fast network oscillations were analyzed with spectral and cross-correlation methods and the resulting traits were used for mapping quantitative trait loci (QTLs), i.e., regions on the genome that may influence hippocampal function. Using genome-wide hippocampal gene expression data, we narrowed the QTLs to eight candidate genes, including Plcb1, a phospholipase that is known to influence hippocampal oscillations. We also identified two genes coding for calcium channels, Cacna1b and Cacna1e, which mediate presynaptic transmitter release and have not been shown to regulate hippocampal network activity previously. Furthermore, we showed that the amplitude of the hippocampal oscillations is genetically correlated with hippocampal volume and several measures of novel environment exploration.

14.
15.
16.
The preterm birth rate in the United States is now 12%. Multiple genes, gene networks, and variants have been associated with this disease. Using a custom database for preterm birth (dbPTB) with a refined set of genes extensively curated from literature and biological databases, we analyzed GWAS of preterm birth for complete genotype data on nearly 2000 preterm and term mothers. We used both the curated genes and a genome-wide approach to carry out a pathway-based analysis. Nineteen significant pathways that withstood FDR correction for multiple testing were identified using both the curated genes and the genome-wide approach. The analysis based on the curated genes was more significant than the genome-wide analysis in 15 of the 19 pathways. This approach demonstrates the use of a validated set of genes, in the analysis of otherwise unsuccessful GWAS data, to identify gene–gene interactions in a way that enhances statistical power and discovery.
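
The FDR correction applied to the pathway p-values can be sketched as follows. The abstract does not say which FDR procedure was used; this sketch assumes the Benjamini-Hochberg step-up procedure, and the pathway p-values are made up for illustration.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected.

    Benjamini-Hochberg step-up rule: find the largest rank k (1-based,
    p-values sorted ascending) with p_(k) <= k * alpha / m, then reject
    the k hypotheses with the smallest p-values.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank * alpha / m:
            cutoff_rank = rank
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


# Hypothetical pathway-level p-values.
pathway_pvalues = [0.001, 0.008, 0.039, 0.041, 0.20]
significant = benjamini_hochberg(pathway_pvalues, alpha=0.05)
```

With five tests at alpha = 0.05 the rank thresholds are 0.01, 0.02, 0.03, 0.04 and 0.05, so only the first two pathways survive the correction in this toy example.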

17.
18.
Ryman N, Jorde PE. Molecular Ecology 2001, 10(10):2361-2373
A variety of statistical procedures are commonly employed when testing for genetic differentiation. In a typical situation two or more samples of individuals have been genotyped at several gene loci by molecular or biochemical means, and in a first step a statistical test for allele frequency homogeneity is performed at each locus separately, using, e.g., the contingency chi-square test, Fisher's exact test, or some modification thereof. In a second step the results from the separate tests are combined for evaluation of the joint null hypothesis that there is no allele frequency difference at any locus, corresponding to the important case where the samples would be regarded as drawn from the same statistical and, hence, biological population. Presently, there are two conceptually different strategies in use for testing the joint null hypothesis of no difference at any locus. One approach is based on the summation of chi-square statistics over loci. Another method is employed by investigators applying the Bonferroni technique (adjusting the P-value required for rejection to account for the elevated alpha errors when performing multiple tests simultaneously) to test if the heterogeneity observed at any particular locus can be regarded as significant when considered separately. Under this approach the joint null hypothesis is rejected if one or more of the component single-locus tests is considered significant under the Bonferroni criterion. We used computer simulations to evaluate the statistical power and realized alpha errors of these strategies when evaluating the joint hypothesis after scoring multiple loci. We find that the 'extended' Bonferroni approach generally is associated with low statistical power and should not be applied in the current setting. Further, and contrary to what might be expected, we find that 'exact' tests typically behave poorly when combined in existing procedures for joint hypothesis testing. Thus, while exact tests are generally to be preferred over approximate ones when testing each particular locus, approximate tests such as the traditional chi-square seem preferable when addressing the joint hypothesis.
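
The two strategies for the joint null hypothesis can be compared in a short sketch. It assumes, for simplicity, that every locus contributes a chi-square statistic with the same even number of degrees of freedom (for example df = 2 for three alleles in two samples), so the tail probability has a closed form and no external library is needed. The function names and the per-locus statistics are hypothetical.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def summed_chi2_test(stats, df_per_locus, alpha=0.05):
    """Strategy 1: sum the statistics and the degrees of freedom over loci."""
    p = chi2_sf_even_df(sum(stats), df_per_locus * len(stats))
    return p, p < alpha


def bonferroni_test(stats, df_per_locus, alpha=0.05):
    """Strategy 2: reject if any single locus passes alpha / number of loci."""
    threshold = alpha / len(stats)
    p_values = [chi2_sf_even_df(s, df_per_locus) for s in stats]
    return p_values, any(p < threshold for p in p_values)


# Hypothetical statistics for three loci, each with df = 2.
locus_stats = [5.0, 4.0, 6.0]
joint_p, joint_reject = summed_chi2_test(locus_stats, df_per_locus=2)
locus_p, bonf_reject = bonferroni_test(locus_stats, df_per_locus=2)
```

In this toy case no single locus reaches the Bonferroni threshold of about 0.0167, while the summed statistic (15 on 6 df, p of roughly 0.02) rejects the joint null, which shows how moderate signals at several loci can add up under summation.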

19.
Many common human traits are believed to be a composite reflection of multiple genetic and non-genetic factors and the genetic contribution is consequently often difficult to characterise. Recent advances suggest that subtle variation in the regulation of gene expression may contribute to complex human traits. In two reports, Cheung and colleagues scale up human genetics analysis to an impressive level in a genome-wide search for the regulators of gene expression. They perform linkage analysis on expression profiles for over 3,500 genes and then employ the HapMap resource to take positive findings through to association studies at the genome-wide level. This work gives new insights into the complexities of gene regulation and the plausibility of genome-wide study design.

20.
Zhang F, Guo X, Deng HW. PLoS ONE 2011, 6(2):e16739
Because they combine the genetic information of multiple loci, multilocus association studies (MLAS) are expected to be more powerful than single-locus association studies (SLAS) in disease gene mapping. However, some researchers found that MLAS had similar or reduced power relative to SLAS, which was partly attributed to the increased degrees of freedom (dfs) in MLAS. Based on partial least-squares (PLS) analysis, we develop an MLAS approach that avoids large dfs. In this approach, genotypes are first decomposed into PLS components that not only capture the majority of the genetic information of multiple loci, but also are relevant for target traits. The extracted PLS components are then regressed on target traits to detect association under multilinear regression. A simulation study based on real data from the HapMap project was used to assess the performance of our PLS-based MLAS as well as other popular multilinear regression-based MLAS approaches under various scenarios, considering genetic effects and the linkage disequilibrium structure of candidate genetic regions. Using the PLS-based MLAS approach, we conducted a genome-wide MLAS of lean body mass, and compared it with our previous genome-wide SLAS of lean body mass. Results from simulations and real data analyses support the improved power of our PLS-based MLAS in disease gene mapping relative to the other three MLAS approaches investigated in this study. We aim to provide an effective and powerful MLAS approach, which may help to overcome the limitations of SLAS in disease gene mapping.
