Similar literature
 20 similar articles found (search time: 31 ms)
1.

Background

Prioritizing genetic variants is a challenge because disease susceptibility loci are often located in genes of unknown function or the relationship with the corresponding phenotype is unclear. A global data-mining exercise on the biomedical literature can establish the phenotypic profile of genes with respect to their connection to disease phenotypes. The importance of protein-protein interaction networks in the genetic heterogeneity of common diseases or complex traits is becoming increasingly recognized. Thus, the development of a network-based approach combined with phenotypic profiling would be useful for disease gene prioritization.

Results

We developed a random-set scoring model and implemented it to quantify phenotype relevance in a network-based disease gene-prioritization approach. We validated our approach based on different gene phenotypic profiles, which were generated from PubMed abstracts, OMIM, and GeneRIF records. We also investigated the validity of several vocabulary filters and different likelihood thresholds for predicted protein-protein interactions in terms of their effect on the network-based gene-prioritization approach, which relies on text-mining of the phenotype data. Our method demonstrated good precision and sensitivity compared with those of two alternative complex-based prioritization approaches. We then conducted a global ranking of all human genes according to their relevance to a range of human diseases. The resulting accurate ranking of known causal genes supported the reliability of our approach. Moreover, these data suggest many promising novel candidate genes for human disorders that have a complex mode of inheritance.

Conclusion

We have implemented and validated a network-based approach to prioritize genes for human diseases based on their phenotypic profile. We have devised a powerful and transparent tool to identify and rank candidate genes. Our global gene prioritization provides a unique resource for the biological interpretation of data from genome-wide association studies, and will help in the understanding of how the associated genetic variants influence disease or quantitative phenotypes.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2105-15-315) contains supplementary material, which is available to authorized users.
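The abstract above does not give the formula behind its random-set scoring model, so the following is only a minimal sketch of the general random-set idea: the mean score of a gene set is standardized against the mean and variance expected for a random set of the same size drawn without replacement. The function name `random_set_z` and the input dictionary are hypothetical, and nothing here should be read as the authors' implementation.

```python
import math
from statistics import mean, pvariance


def random_set_z(scores, gene_set):
    """Standardized random-set score (illustrative assumption, not the paper's code).

    scores:   dict mapping every gene to a phenotype-relevance score
    gene_set: genes of interest, e.g. a candidate gene and its network neighbours
    """
    members = [g for g in set(gene_set) if g in scores]
    n_total, m = len(scores), len(members)
    if m == 0 or n_total < 2:
        return 0.0
    mu = mean(scores.values())
    sigma2 = pvariance(scores.values())
    # Variance of the mean of m genes sampled without replacement from n_total genes
    # (population variance with the finite-population correction).
    var_of_mean = (sigma2 / m) * (n_total - m) / (n_total - 1)
    if var_of_mean == 0:
        return 0.0
    set_mean = mean(scores[g] for g in members)
    return (set_mean - mu) / math.sqrt(var_of_mean)


if __name__ == "__main__":
    toy_scores = {"a": 1.0, "b": 2.0, "c": 3.0, "d": 4.0}
    print(random_set_z(toy_scores, ["c", "d"]))
```

A large positive value indicates that the set scores higher than a random set of the same size would be expected to; in a network setting the set could be a candidate gene together with its interaction partners.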

2.
Bacteria and Archaea display a variety of phenotypic traits and can adapt to diverse ecological niches. However, systematic annotation of prokaryotic phenotypes is lacking. We have therefore developed ProTraits, a resource containing ∼545 000 novel phenotype inferences, spanning 424 traits assigned to 3046 bacterial and archaeal species. These annotations were assigned by a computational pipeline that associates microbes with phenotypes by text-mining the scientific literature and the broader World Wide Web, while also being able to define novel concepts from unstructured text. Moreover, the ProTraits pipeline assigns phenotypes by drawing extensively on comparative genomics, capturing patterns in gene repertoires, codon usage biases, proteome composition and co-occurrence in metagenomes. Notably, we find that gene synteny is highly predictive of many phenotypes, and highlight examples of gene neighborhoods associated with spore-forming ability. A global analysis of trait interrelatedness outlined clusters in the microbial phenotype network, suggesting common genetic underpinnings. Our extended set of phenotype annotations allows detection of 57 088 high confidence gene-trait links, which recover many known associations involving sporulation, flagella, catalase activity, aerobicity, photosynthesis and other traits. Over 99% of the commonly occurring gene families are involved in genetic interactions conditional on at least one phenotype, suggesting that epistasis has a major role in shaping microbial gene content.

3.
Gene expression QTL (eQTL) mapping can suggest candidate regulatory relationships between genes. Recent advances in mammalian phenotype annotation such as mammalian phenotype ontology (MPO) enable systematic analysis of the phenotypic spectrum subserved by many genes. In this study we combined eQTL mapping and phenotypic spectrum analysis to predict gene regulatory relationships. Five pairs of genes with similar phenotypic effects and potential regulatory relationships suggested by eQTL mapping were identified. Lines of evidence supporting some of the predicted regulatory relationships were obtained from biological literature. A particularly notable example is that promoter sequence analysis and real-time PCR assays support the predicted regulation of protein kinase C epsilon (Prkce) by cAMP responsive element binding protein 1 (Creb1). Our results show that the combination of gene eQTL mapping and phenotypic spectrum analysis may provide a valuable approach to uncovering gene regulatory relations underlying mammalian phenotypes.

4.
Because some genes have been cloned that have a known biochemical or physiological function, genetic variation can be measured in a population at loci that may directly influence a phenotype of interest. With this measured genotype approach, specific alleles or haplotypes in the probed DNA region can be assigned phenotypic effects. In this paper we address several problems encountered in implementing the measured genotype approach with restriction site data. A number of analytical problems arise in part as a consequence of the linkage disequilibrium that is commonly encountered when dealing with small DNA regions: 1) different restriction site polymorphisms are not statistically independent, 2) the sites being measured are not likely to be the direct cause of the associated phenotypic effects, 3) haplotype classes may be phenotypically heterogeneous, and 4) the sites that are most strongly associated with phenotypic effects are not necessarily the most closely linked to the actual genetic cause of the effects. When recombination and gene conversion are rare, the primary cause of linkage disequilibrium is history (mutational origin, genetic drift, hitchhiking, etc.). We deal with historical association directly by producing a cladogram that partially reconstructs the evolutionary history of the present-day haplotype variability. The cladogram defines a nested analysis of variance that simultaneously detects phenotypic effects, localizes the effects within the cladogram, and identifies haplotypes that are potentially heterogeneous in their phenotypic associations. The power of this approach is illustrated by an analysis of the associations between alcohol dehydrogenase (ADH) activity and restriction site variability in a 13-kb fragment surrounding the ADH locus in Drosophila melanogaster.

5.
Forward Genomics: a comparative genomics approach to link phenotype to genotype. Despite the availability of several sequenced genomes, we know very little about the specific changes in the DNA that underlie phenotypic differences between species. The main reason is that species differ by numerous genomic and phenotypic changes. A new comparative genomics method addresses this question for phenotypes with independent evolutionary losses by searching for genomic regions that exhibit an elevated number of mutations in exactly these phenotype-loss species. The sequencing of thousands of novel genomes in the near future will make it possible to use comparative genomics to systematically search for such DNA changes that are associated with phenotypic differences.

6.
Functional genomics screens using multi-parametric assays are powerful approaches for identifying genes involved in particular cellular processes. However, they suffer from problems like noise, and often provide little insight into molecular mechanisms. A bottleneck for addressing these issues is the lack of computational methods for the systematic integration of multi-parametric phenotypic datasets with molecular interactions. Here, we present Integrative Multi Profile Analysis of Cellular Traits (IMPACT). The main goal of IMPACT is to identify the most consistent phenotypic profile among interacting genes. This approach utilizes two types of external information: sets of related genes (IMPACT-sets) and network information (IMPACT-modules). Based on the notion that interacting genes are more likely to be involved in similar functions than non-interacting genes, this data is used as a prior to inform the filtering of phenotypic profiles that are similar among interacting genes. IMPACT-sets selects the most frequent profile among a set of related genes. IMPACT-modules identifies sub-networks containing genes with similar phenotype profiles. The statistical significance of these selections is subsequently quantified via permutations of the data. IMPACT (1) handles multiple profiles per gene, (2) rescues genes with weak phenotypes and (3) accounts for multiple biases, e.g. those caused by the network topology. Application to a genome-wide RNAi screen on endocytosis showed that IMPACT improved the recovery of known endocytosis-related genes, decreased off-target effects, and detected consistent phenotypes. Those findings were confirmed by rescreening 468 genes. Additionally, we validated an unexpected influence of the IGF-receptor on EGF-endocytosis. IMPACT facilitates the selection of high-quality phenotypic profiles using different types of independent information, thereby supporting the molecular interpretation of functional screens.
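The IMPACT-sets step described above (pick the most frequent profile in a set of related genes, then assess it by permutation) can be sketched roughly as follows. This is an illustration under stated assumptions, not the published tool: profiles are assumed to be already discretized into tuples, the null model simply draws random gene sets of the same size, and the function names `most_frequent_profile` and `permutation_p_value` are hypothetical. The bias corrections mentioned in the abstract (for example for network topology) are not modelled.

```python
import random
from collections import Counter


def most_frequent_profile(gene_set, profiles):
    """Return (profile, count) for the profile shared by the most genes in gene_set.

    profiles: dict gene -> list of discretized profiles (tuples); a gene may
    carry several profiles, and each distinct one is counted once per gene.
    """
    counts = Counter()
    for gene in gene_set:
        for profile in set(profiles.get(gene, [])):
            counts[profile] += 1
    if not counts:
        return None, 0
    return counts.most_common(1)[0]


def permutation_p_value(gene_set, profiles, n_perm=1000, seed=0):
    """Empirical p-value: how often a random gene set of equal size reaches
    at least the observed count for its own most frequent profile."""
    gene_set = list(gene_set)
    rng = random.Random(seed)
    all_genes = list(profiles)
    _, observed = most_frequent_profile(gene_set, profiles)
    hits = 0
    for _ in range(n_perm):
        random_set = rng.sample(all_genes, len(gene_set))
        _, count = most_frequent_profile(random_set, profiles)
        if count >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    toy = {"g1": [(1, 0), (0, 1)], "g2": [(1, 0)], "g3": [(1, 0)]}
    toy.update({f"g{i}": [(i, i)] for i in range(4, 11)})
    print(most_frequent_profile(["g1", "g2", "g3"], toy))
    print(permutation_p_value(["g1", "g2", "g3"], toy))
```

Counting each distinct profile once per gene is one simple way to accommodate multiple profiles per gene; other weighting schemes would be equally plausible.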

7.
The symbioses between plants of the Rubiaceae and Primulaceae families with Burkholderia bacteria represent unique and intimate plant–bacterial relationships. Many of these interactions have been identified through PCR-dependent typing methods, but there is little information available about their functional and ecological roles. We assembled 17 new endophyte genomes representing endophytes from 13 plant species, including those of two previously unknown associations. Genomes of leaf endophytes belonging to Burkholderia s.l. show extensive signs of genome reduction, albeit to varying degrees. Except for one endophyte, none of the bacterial symbionts could be isolated on standard microbiological media. Despite their taxonomic diversity, all endophyte genomes contained gene clusters linked to the production of specialized metabolites, including genes linked to cyclitol sugar analog metabolism and in one instance non-ribosomal peptide synthesis. These genes and gene clusters are unique within Burkholderia s.l. and are likely horizontally acquired. We propose that the acquisition of secondary metabolite gene clusters through horizontal gene transfer is a prerequisite for the evolution of a stable association between these endophytes and their hosts.

8.
9.
10.
In traditional mutant screening approaches, genetic variants are tested for one or a small number of phenotypes. Once bona fide variants are identified, they are typically subjected to a limited number of secondary phenotypic screens. Although this approach is excellent at finding genes involved in specific biological processes, the lack of wide and systematic interrogation of phenotype limits the ability to detect broader syndromes and connections between genes and phenotypes. It could also prevent detection of the primary phenotype of a mutant. As part of a systems biology approach to understand plastid function, large numbers of Arabidopsis thaliana homozygous T-DNA lines are being screened with parallel morphological, physiological, and chemical phenotypic assays (www.plastid.msu.edu). To refine our approaches and validate the use of this high-throughput screening approach for understanding gene function and functional networks, approximately 100 wild-type plants and 13 known mutants representing a variety of phenotypes were analyzed by a broad range of assays including metabolite profiling, morphological analysis, and chlorophyll fluorescence kinetics. Data analysis using a variety of statistical approaches showed that such industrial approaches can reliably identify plant mutant phenotypes. More significantly, the study uncovered previously unreported phenotypes for these well-characterized mutants and unexpected associations between different physiological processes, demonstrating that this approach has strong advantages over traditional mutant screening approaches. Analysis of wild-type plants revealed hundreds of statistically robust phenotypic correlations, including metabolites that are not known to share direct biosynthetic origins, raising the possibility that these metabolic pathways have closer relationships than is commonly suspected.  相似文献   

11.
The long-established view of Wolbachia as reproductive parasites of insects is becoming complicated as an increasing number of papers describe a richer picture of Wolbachia-mediated phenotypes in insects. The search for the molecular basis for this phenotypic variability has been greatly aided by the recent sequencing of several Wolbachia genomes. These studies have revealed putative genes and pathways that are likely to be involved in the host-symbiont interaction. Whereas significant progress is being made from comparative genomic studies together with the use of model host systems like Drosophila, the ultimate linking of phenotype to genotype will require the development of genetic manipulation technology for both host and symbiont.

12.
Theileria annulata, an intracellular parasite of bovine lymphoid cells, induces substantial phenotypic alterations to its host cell including continuous proliferation, cytoskeletal changes and resistance to apoptosis. While parasite induced modulation of host cell signal transduction pathways and NFκB activation are established, there remains considerable speculation on the complexities of the parasite directed control mechanisms that govern these radical changes to the host cell. Our objectives in this study were to provide a comprehensive analysis of the global changes to host cell gene expression with emphasis on those that result from direct intervention by the parasite. By using comparative microarray analysis of an uninfected bovine cell line and its Theileria infected counterpart, in conjunction with use of the specific parasitacidal agent, buparvaquone, we have identified a large number of host cell gene expression changes that result from parasite infection. Our results indicate that the viable parasite can irreversibly modify the transformed phenotype of a bovine cell line. Fifty percent of genes with altered expression failed to show a reversible response to parasite death, a possible contributing factor to initiation of host cell apoptosis. The genes that did show an early predicted response to loss of parasite viability highlighted a sub-group of genes that are likely to be under direct control by parasite infection. Network and pathway analysis demonstrated that this sub-group is significantly enriched for genes involved in regulation of chromatin modification and gene expression. The results provide evidence that the Theileria parasite has the regulatory capacity to generate widespread change to host cell gene expression in a complex and largely irreversible manner.  相似文献   

13.
Understanding the categorization of human diseases is critical for reliably identifying disease causal genes. Recently, genome-wide studies of abnormal chromosomal locations related to diseases have mapped >2000 phenotype–gene relations, which provide valuable information for classifying diseases and identifying candidate genes as drug targets. In this article, a regularized non-negative matrix tri-factorization (R-NMTF) algorithm is introduced to co-cluster phenotypes and genes, and simultaneously detect associations between the detected phenotype clusters and gene clusters. The R-NMTF algorithm factorizes the phenotype–gene association matrix under the prior knowledge from phenotype similarity network and protein–protein interaction network, supervised by the label information from known disease classes and biological pathways. In the experiments on disease phenotype–gene associations in OMIM and KEGG disease pathways, R-NMTF significantly improved the classification of disease phenotypes and disease pathway genes compared with support vector machines and Label Propagation in cross-validation on the annotated phenotypes and genes. The newly predicted phenotypes in each disease class are highly consistent with human phenotype ontology annotations. The roles of the new member genes in the disease pathways are examined and validated in the protein–protein interaction subnetworks. Extensive literature review also confirmed many new members of the disease classes and pathways as well as the predicted associations between disease phenotype classes and pathways.  相似文献   

14.
15.
As the outermost representation of life, phenotype is the fundamental basis on which humans understand life and disease. With the advent of molecular and sequencing techniques, however, a growing portion of scientific research focuses primarily on the molecular level of life. Our understanding of molecular variations and mechanisms can only be fully utilized when it is translated to the phenotypic level. In this study, we constructed a similarity network for a phenotype ontology, and then applied network analysis methods to discover phenotype/disease clusters. We then used machine learning models to predict protein-phenotype associations. Each protein was characterized by the functional profiles of its interaction neighbors in the protein-protein interaction network. Our methods can not only predict protein-phenotype associations, but also reveal the underlying mechanisms from protein to phenotype.

16.
Using profiles of phylogenetic profiles (P-cubic) we compared the evolutionary dynamics of different kinds of functional associations. Ordered from most to least evolutionarily stable, these associations were genes in the same operons, genes whose products participate in the same biochemical pathway, genes coding for physically interacting proteins and genes in the same regulons. Regulons showed the most plastic functional interactions with evolutionary stabilities barely better than those of unrelated genes. Further regulon analyses showed that global regulators contain less evolutionarily stable associations than local regulators. Genes co-repressed by global regulators had a higher evolutionary conservation than genes co-activated by global regulators. However, the reverse was true for genes co-repressed and co-activated by local regulators. Of all the regulon-related associations, the relationship between regulators and their target genes showed the most evolutionary stability. Different negative data sets built to contrast against each of the analysed kinds of modules also differed in evolutionary conservation revealing further underlying genome organization. Applying P-cubic analyses to other genomes might help visualize genome organization, understand the evolutionary importance and plasticity of functional associations and compare the quality of data sets expected to reflect functional interactions, such as those coming from high-throughput experiments.  相似文献   

17.
The limited knowledge of genomic diversity and functional genes associated with the traits of soybean varieties has resulted in slow progress in breeding. In this study, we sequenced the genomes of 250 soybean landraces and cultivars from China, America, and Europe, and investigated their population structure, genetic diversity and architecture, and the selective sweep regions of these accessions. Five novel agronomically important genes were identified, and the effects of functional mutations in respect...

18.
We present a prototype of a new database tool, GeneCensus, which focuses on comparing genomes globally, in terms of the collective properties of many genes, rather than in terms of the attributes of a single gene (e.g. sequence similarity for a particular ortholog). The comparisons are presented in a visual fashion over the web at GeneCensus.org. The system concentrates on two types of comparisons: (i) trees based on the sharing of generalized protein families between genomes, and (ii) whole pathway analysis in terms of activity levels. For the trees, we have developed a module (TreeViewer) that clusters genomes in terms of the folds, superfamilies or orthologs they share (all of which can be considered generalized ‘families’ or ‘protein parts’), and compares the resulting trees side-by-side with those built from sequence similarity of individual genes (e.g. a traditional tree built on ribosomal similarity). We also include comparisons to trees built on whole-genome dinucleotide or codon composition. For pathway comparisons, we have implemented a module (PathwayPainter) that graphically depicts, in selected metabolic pathways, the fluxes or expression levels of the associated enzymes (i.e. generalized ‘activities’). One can, consequently, compare organisms (and organism states) in terms of representations of these systemic quantities. Development of this module involved compiling, calculating and standardizing flux and expression information from many different sources. We illustrate pathway analysis for enzymes involved in central metabolism. We are able to show that, to some degree, flux and expression fluctuations have characteristic values in different sections of the central metabolism and that control points in this system (e.g. hexokinase, pyruvate kinase, phosphofructokinase, isocitrate dehydrogenase and citrate synthase) tend to be especially variable in flux and expression.
Both the TreeViewer and PathwayPainter modules connect to other information sources related to individual-gene or organism properties (e.g. a single-gene structural annotation viewer).

19.
Phenotypes are investigated in model organisms to understand and reveal the molecular mechanisms underlying disease. Phenotype ontologies were developed to capture and compare phenotypes within the context of a single species. Recently, these ontologies were augmented with formal class definitions that may be utilized to integrate phenotypic data and enable the direct comparison of phenotypes between different species. We have developed a method to transform phenotype ontologies into a formal representation, combine phenotype ontologies with anatomy ontologies, and apply a measure of semantic similarity to construct the PhenomeNET cross-species phenotype network. We demonstrate that PhenomeNET can identify orthologous genes, genes involved in the same pathway and gene-disease associations through the comparison of mutant phenotypes. We provide evidence that the Adam19 and Fgf15 genes in mice are involved in the tetralogy of Fallot, and, using zebrafish phenotypes, propose the hypothesis that the mammalian homologs of Cx36.7 and Nkx2.5 lie in a pathway controlling cardiac morphogenesis and electrical conductivity which, when defective, cause the tetralogy of Fallot phenotype. Our method implements a whole-phenome approach toward disease gene discovery and can be applied to prioritize genes for rare and orphan diseases for which the molecular basis is unknown.
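The abstract above says only that "a measure of semantic similarity" is applied to ontology-annotated phenotypes; it does not say which one. As a rough illustration of the general idea, the sketch below uses a plain Jaccard index over ancestor closures in a toy ontology. The choice of Jaccard, the term names and the function names are all assumptions made for this example.

```python
def ancestors(term, parents):
    """Return the term together with all of its superclasses.

    parents: dict term -> list of direct parent terms (toy ontology).
    """
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def phenotype_similarity(terms_a, terms_b, parents):
    """Jaccard index of the ancestor closures of two phenotype annotation sets."""
    closure_a = set().union(*(ancestors(t, parents) for t in terms_a))
    closure_b = set().union(*(ancestors(t, parents) for t in terms_b))
    union = closure_a | closure_b
    if not union:
        return 0.0
    return len(closure_a & closure_b) / len(union)


if __name__ == "__main__":
    toy_parents = {
        "VSD": ["heart defect"],
        "ASD": ["heart defect"],
        "heart defect": ["phenotype"],
        "short tail": ["phenotype"],
    }
    print(phenotype_similarity({"VSD"}, {"ASD"}, toy_parents))
    print(phenotype_similarity({"VSD"}, {"short tail"}, toy_parents))
```

Two mutants annotated with sibling heart-defect terms score higher than a heart-defect mutant compared with an unrelated tail phenotype, which is the behaviour a cross-species phenotype network relies on once the ontologies share common superclasses.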

20.
Deciphering the genetic basis of human diseases is an important goal of biomedical research. On the basis of the assumption that phenotypically similar diseases are caused by functionally related genes, we propose a computational framework that integrates human protein–protein interactions, disease phenotype similarities, and known gene–phenotype associations to capture the complex relationships between phenotypes and genotypes. We develop a tool named CIPHER to predict and prioritize disease genes, and we show that the global concordance between the human protein network and the phenotype network reliably predicts disease genes. Our method is applicable to genetically uncharacterized phenotypes, effective in the genome-wide scan of disease genes, and also extendable to explore gene cooperativity in complex diseases. The predicted genetic landscape of over 1000 human phenotypes reveals the global modular organization of phenotype–genotype relationships. The genome-wide prioritization of candidate genes for over 5000 human phenotypes, including those with under-characterized disease loci or even those lacking known association, is publicly released to facilitate future discovery of disease genes.
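The "concordance between the protein network and the phenotype network" can be illustrated with a small sketch: for a query phenotype and a candidate gene, correlate (a) the similarity of the query to every phenotype with known genes and (b) the network closeness of the candidate to those known genes. The abstract does not specify the scoring details, so the Gaussian kernel on shortest-path length, the use of Pearson correlation and all names below are assumptions for illustration, not the CIPHER implementation.

```python
import math
from collections import deque


def shortest_path_lengths(graph, source):
    """Breadth-first search distances from source in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either vector is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def concordance_score(gene, query, graph, known_genes, pheno_sim):
    """Correlate phenotype similarity with network closeness for one candidate.

    graph:       dict gene -> iterable of interacting genes
    known_genes: dict phenotype -> list of genes known to cause it
    pheno_sim:   dict query phenotype -> dict phenotype -> similarity
    """
    dist = shortest_path_lengths(graph, gene)
    phenotypes = sorted(known_genes)
    # Closeness of the candidate to each phenotype's known genes
    # (Gaussian kernel on path length: an assumption of this sketch).
    closeness = [
        sum(math.exp(-dist[g] ** 2) for g in known_genes[p] if g in dist)
        for p in phenotypes
    ]
    similarity = [pheno_sim[query][p] for p in phenotypes]
    return pearson(similarity, closeness)


if __name__ == "__main__":
    chain = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
    known = {"P1": ["A"], "P2": ["B"], "P3": ["D"]}
    sim = {"Q": {"P1": 0.9, "P2": 0.5, "P3": 0.1}}
    for candidate in ("A", "D"):
        print(candidate, concordance_score(candidate, "Q", chain, known, sim))
```

Ranking all candidate genes by this score for a given phenotype yields a prioritized list; in the toy chain, gene A sits next to the genes of the phenotypes most similar to the query and therefore outranks gene D.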
