Similar literature
1.
2.
3.
Guo X, Gao L, Wei C, Yang X, Zhao Y, Dong A 《PloS one》2011,6(9):e24171
The identification of disease-causing genes is a fundamental challenge in human health and of great importance in improving medical care, and provides a better understanding of gene functions. Recent computational approaches based on the interactions among human proteins and disease similarities have shown their power in tackling the issue. In this paper, a novel systematic and global method that integrates two heterogeneous networks for prioritizing candidate disease-causing genes is provided, based on the observation that genes causing the same or similar diseases tend to lie close to one another in a network of protein-protein interactions. In this method, the association score function between a query disease and a candidate gene is defined as the weighted sum of all the association scores between similar diseases and neighbouring genes. Moreover, the topological correlation of these two heterogeneous networks can be incorporated into the definition of the score function, and finally an iterative algorithm is designed for this issue. This method was tested with 10-fold cross-validation on all 1,126 diseases that have at least one known causal gene, and it ranked the correct gene as one of the top ten in 622 of all the 1,428 cases, significantly outperforming a state-of-the-art method called PRINCE. The results of this method were applied to study three multi-factorial disorders: breast cancer, Alzheimer disease and diabetes mellitus type 2, and some suggestions of novel causal genes and candidate disease-causing subnetworks were provided for further investigation.
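The scoring idea in this abstract (a disease-gene score built as a weighted sum over similar diseases and neighbouring genes, then iterated) can be illustrated with a small sketch. This is not the authors' published formula: the update rule, the damping weight `alpha`, the row normalization, and the toy matrices are all assumptions chosen only to show the general shape of such an iteration.

```python
def normalize_rows(matrix):
    """Scale each row to sum to 1 (all-zero rows stay zero)."""
    result = []
    for row in matrix:
        total = sum(row)
        result.append([v / total if total else 0.0 for v in row])
    return result


def matmul(a, b):
    """Plain-Python matrix product of a (n x k) and b (k x m)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(col) for col in zip(*m)]


def propagate(disease_sim, gene_net, known, alpha=0.5, iters=50):
    """Iterate F <- alpha * S F W^T + (1 - alpha) * A (assumed update rule).

    S: row-normalized disease similarity, W: row-normalized gene network,
    A: known disease-gene associations (diseases x genes).
    """
    s = normalize_rows(disease_sim)
    w_t = transpose(normalize_rows(gene_net))
    scores = [row[:] for row in known]
    for _ in range(iters):
        # Weighted sum over similar diseases and neighbouring genes.
        spread = matmul(matmul(s, scores), w_t)
        scores = [[alpha * spread[d][g] + (1 - alpha) * known[d][g]
                   for g in range(len(known[0]))]
                  for d in range(len(known))]
    return scores


def rank_genes(scores, disease):
    """Gene indices for one disease, best score first."""
    row = scores[disease]
    return sorted(range(len(row)), key=lambda g: row[g], reverse=True)


# Toy data (hypothetical): 2 diseases, 3 genes linked in a chain 0-1-2.
disease_sim = [[1.0, 0.5], [0.5, 1.0]]
gene_net = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
known = [[1, 0, 0], [0, 0, 0]]  # disease 0 is known to involve gene 0
scores = propagate(disease_sim, gene_net, known)
print(rank_genes(scores, 0))
```

With row-normalized inputs and `alpha` below 1, each pass shrinks the difference between successive score matrices, so the loop settles quickly. Genes closer to a known causal gene in the network end up with higher scores.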

4.
Gene-based association tests aggregate genotypes across multiple variants for each gene, providing an interpretable gene-level analysis framework for genome-wide association studies (GWAS). Early gene-based test applications often focused on rare coding variants; a more recent wave of gene-based methods, e.g. TWAS, use eQTLs to interrogate regulatory associations. Regulatory variants are expected to be particularly valuable for gene-based analysis, since most GWAS associations to date are non-coding. However, identifying causal genes from regulatory associations remains challenging and contentious. Here, we present a statistical framework and computational tool to integrate heterogeneous annotations with GWAS summary statistics for gene-based analysis, applied with comprehensive coding and tissue-specific regulatory annotations. We compare power and accuracy in identifying causal genes across single-annotation, omnibus, and annotation-agnostic gene-based tests in simulation studies and an analysis of 128 traits from the UK Biobank, and find that incorporating heterogeneous annotations in gene-based association analysis increases power and performance in identifying causal genes.

5.
Immunoinformatics is an emerging new field that benefits from computational analyses and tools that facilitate the understanding of the immune system. A large number of immunoinformatics resources such as immune-related databases and analysis software are available through the World Wide Web for the benefit of the research community. However, immunoinformatics developments have sometimes remained isolated from mainstream bioinformatics. Therefore, there is clearly a need for integration, which will empower the exchange of data and annotations within the scientific community in a quick and efficient fashion. Here, we have chosen the Distributed Annotation System (DAS) for integrating in-house annotations on experimental and predicted HLA I-restriction elements of CD8 T-cell epitopes with sequence and structural information.

6.
7.
8.
9.
Biological age measures outperform chronological age in predicting various aging outcomes, yet little is known regarding genetic predisposition. We performed genome-wide association scans of two age-adjusted biological age measures (PhenoAgeAcceleration and BioAgeAcceleration), estimated from clinical biochemistry markers (Levine et al., 2018; Levine, 2013) in European-descent participants from UK Biobank. The strongest signals were found in the APOE gene, tagged by the two major protein-coding SNPs: for PhenoAgeAccel, rs429358 (APOE ε4 determinant) (p = 1.50 × 10^-72); for BioAgeAccel, rs7412 (APOE ε2 determinant) (p = 3.16 × 10^-60). Interestingly, we observed inverse APOE ε2 and ε4 associations and unique pathway enrichments when comparing the two biological age measures. Genes associated with BioAgeAccel were enriched in lipid related pathways, while genes associated with PhenoAgeAccel showed enrichment for immune system, cell function, and carbohydrate homeostasis pathways, suggesting the two measures capture different aging domains. Our study reaffirms that aging patterns are heterogeneous across individuals, and the manner in which a person ages may be partly attributed to genetic predisposition.

10.
Systems biology approaches that are based on the genetics of gene expression have been fruitful in identifying genetic regulatory loci related to complex traits. We use microarray and genetic marker data from an F2 mouse intercross to examine the large-scale organization of the gene co-expression network in liver, and annotate several gene modules in terms of 22 physiological traits. We identify chromosomal loci (referred to as module quantitative trait loci, mQTL) that perturb the modules and describe a novel approach that integrates network properties with genetic marker information to model gene/trait relationships. Specifically, using the mQTL and the intramodular connectivity of a body weight–related module, we describe which factors determine the relationship between gene expression profiles and weight. Our approach results in the identification of genetic targets that influence gene modules (pathways) that are related to the clinical phenotypes of interest.

11.
Motivation: Staining the human metaphase chromosomes reveals characteristic banding patterns known as cytogenetic bands or cytobands. Using technologies based on metaphase chromosomes, researchers have accumulated much knowledge about the correlations between human diseases and specific cytoband aberrations, indicating the presence of disease-associated genes in those bands. With the progress of the human genome project and techniques such as fluorescent in situ hybridization, many genes have been assigned to the cytobands and annotated in public databases, making it possible to find all genes in the disease-related cytobands through database queries. However, finding genes in cytobands remains an imprecise process, partly due to the insufficiency of current methods for cytoband queries, especially for those based on cytogenetic annotations. Results: By transforming the cytoband annotations into numerical segments, a new query method is developed that is able to accurately define any cytogenetic ranges in human chromosomes. A query system (designated cytoband query system, CQS) is implemented using cytogenetic annotations in the public domain. Judged by a performance test, CQS executed as accurately as expected using cytogenetic annotations from NCBI Map Viewer. The new method is scalable and can be applied to genomes from other species. Availability: The CQS is freely accessible over the Internet at http://moris.csie.ncku.edu.tw/cqs/ Contact: clh9{at}mail.ncku.edu.tw Supplementary information: http://moris.csie.ncku.edu.tw/cqs/
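The abstract does not say how CQS turns cytoband annotations into numerical segments, so the following is only one possible encoding, sketched to show why numeric segments make range queries simple. The fixed digit width `WIDTH`, the sign convention for the p arm, and all function names are hypothetical, and the chromosome number is left out for brevity.

```python
import re

WIDTH = 6  # hypothetical fixed number of digits per band label
_BAND = re.compile(r"^([pq])(\d+)(?:\.(\d+))?$")


def band_to_segment(band):
    """Map a band label such as 'p36.21' to an inclusive integer segment.

    A coarser label (p36.2) yields a wider segment that contains the
    segments of its sub-bands (p36.21, p36.22, ...).
    """
    m = _BAND.match(band)
    if m is None:
        raise ValueError(f"unrecognized band label: {band!r}")
    arm, digits = m.group(1), m.group(2) + (m.group(3) or "")
    if len(digits) > WIDTH:
        raise ValueError(f"band label too fine for WIDTH={WIDTH}: {band!r}")
    scale = 10 ** (WIDTH - len(digits))
    lo = int(digits) * scale
    hi = lo + scale - 1
    if arm == "p":
        # Assumed convention: mirror the p arm onto negative numbers so
        # both arms share one axis with the centromere at zero.
        lo, hi = -hi, -lo
    return lo, hi


def range_to_segment(start_band, end_band):
    """Segment covering a cytogenetic range such as p36.1-p36.3."""
    a, b = band_to_segment(start_band), band_to_segment(end_band)
    return min(a[0], b[0]), max(a[1], b[1])


def overlaps(a, b):
    """True if two inclusive segments share at least one position."""
    return a[0] <= b[1] and b[0] <= a[1]


query = range_to_segment("p36.1", "p36.3")
print(overlaps(band_to_segment("p36.21"), query))
```

Once every annotation is a pair of integers, a cytoband query reduces to an interval-overlap test, which can be indexed in an ordinary database.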

12.
Triticeae species (including wheat, barley and rye) have huge and complex genomes due to polyploidization and a high content of transposable elements (TEs). TEs are known to play a major role in the structure and evolutionary dynamics of Triticeae genomes. During the last 5 years, substantial stretches of contiguous genomic sequence from various species of Triticeae have been generated, making it necessary to update and standardize TE annotations and nomenclature. In this study we propose standard procedures for these tasks, based on structure, nucleic acid and protein sequence homologies. We report statistical analyses of TE composition and distribution in large blocks of genomic sequences from wheat and barley. Altogether, 3.8 Mb of wheat sequence available in the databases was analyzed or re-analyzed, and compared with 1.3 Mb of re-annotated genomic sequences from barley. The wheat sequences were relatively gene-rich (one gene per 23.9 kb), although wheat gene-derived sequences represented only 7.8% (159 elements) of the total, while the remainder mainly comprised coding sequences found in TEs (54.7%, 751 elements). Class I elements [mainly long terminal repeat (LTR) retrotransposons] accounted for the major proportion of TEs, in terms of sequence length as well as element number (83.6% and 498, respectively). In addition, we show that the gene-rich sequences of wheat genome A seem to have a higher TE content than those of genomes B and D, or of barley gene-rich sequences. Moreover, among the various TE groups, MITEs were most often associated with genes: 43.1% of MITEs fell into this category. Finally, the TRIM and copia elements were shown to be the most active TEs in the wheat genome. The implications of these results for the evolution of diploid and polyploid wheat species are discussed.

13.
Piro RM, Di Cunto F 《The FEBS journal》2012,279(5):678-696
The identification of genes involved in human hereditary diseases often requires the time-consuming and expensive examination of a great number of possible candidate genes, since genome-wide techniques such as linkage analysis and association studies frequently select many hundreds of 'positional' candidates. Even considering the positive impact of next-generation sequencing technologies, the prioritization of candidate genes may be an important step for disease-gene identification. In this paper we develop a basic classification scheme for computational approaches to disease-gene prediction and apply it to exhaustively review bioinformatics tools that have been developed for this purpose, focusing on conceptual aspects rather than technical detail and performance. Finally, we discuss some past successes obtained by computational approaches to illustrate their beneficial contribution to medical research.

14.
MOTIVATION: Inferring the genetic interaction mechanism using Bayesian networks has recently drawn increasing attention due to its well-established theoretical foundation and statistical robustness. However, the relative insufficiency of experiments with respect to the number of genes leads to many false positive inferences. RESULTS: We propose a novel method to infer genetic networks by alleviating the shortage of available mRNA expression data with prior knowledge. We call the proposed method 'modularized network learning' (MONET). Firstly, the proposed method divides a whole gene set into overlapping modules, considering biological annotations and expression data together. Secondly, it infers a Bayesian network for each module, and integrates the learned subnetworks into a global network. An algorithm that measures the similarity between genes based on the hierarchy, specificity and multiplicity of biological annotations is presented. The proposed method draws a global picture of inter-module relationships as well as a detailed view of intra-module interactions. We applied the proposed method to analyze Saccharomyces cerevisiae stress data, and found several hypotheses that suggest putative functions of unclassified genes. We also compared the proposed method with a whole-set-based approach and two expression-based clustering approaches.

15.
Perez-Iratxeta C, Keer HS, Bork P, Andrade MA 《BioTechniques》2002,32(6):1380-2, 1384-5
The increase of information in biology makes it difficult for researchers in any field to keep current with the literature. The MEDLINE database of scientific abstracts can be quickly scanned using electronic mechanisms. Potentially interesting abstracts can be selected by matching words joined by Boolean operators. However, this means of selecting documents is not optimal. Nonspecific queries have to be effected, resulting in large numbers of irrelevant abstracts that have to be manually scanned. To facilitate this analysis, we have developed a system that compiles a summary of subjects and related documents on the results of a MEDLINE query. For this, we have applied a fuzzy binary relation formalism that deduces relations between words present in a set of abstracts preprocessed with a standard grammatical tagger. Those relations are used to derive ensembles of related words and their associated subsets of abstracts. The algorithm can be used publicly at http://www.bork.embl-heidelberg.de/xplormed/.

16.
Citizen science initiatives have been increasingly used by researchers as a source of occurrence data to model the distribution of alien species. Since citizen science presence-only data suffer from some fundamental issues, efforts have been made to combine these data with those provided by scientifically structured surveys. Surprisingly, only a few studies proposing data integration evaluated the contribution of this process to the effective sampling of species' environmental niches and, consequently, its effect on model predictions on new time intervals. We relied on niche overlap analyses, machine learning classification algorithms and ecological niche models to compare the ability of data from citizen science and scientific surveys, along with their integration, in capturing the realized niche of 13 invasive alien species in Italy. Moreover, we assessed differences in current and future invasion risk predicted by each data set under multiple global change scenarios. We showed that data from citizen science and scientific surveys captured similar species niches though highlighting exclusive portions associated with clearly identifiable environmental conditions. In terrestrial species, citizen science data granted the highest gain in environmental space to the pooled niches, determining an increased future biological invasion risk. A few aquatic species modelled at the regional scale reported a net loss in the pooled niches compared to their scientific survey niches, suggesting that citizen science data may also lead to contraction in pooled niches. For these species, models predicted a lower future biological invasion risk. These findings indicate that citizen science data may represent a valuable contribution to predicting future spread of invasive alien species, especially within national-scale programmes. 
At the same time, citizen science data collected on species poorly known to citizen scientists, or in strictly local contexts, may strongly affect the niche quantification of these taxa and the prediction of their future biological invasion risk.

17.
Conversion of cholesterol to pregnenolone is the rate-limiting step in steroidogenesis, which is mediated by StAR protein. The mammalian genome contains 15 START domain proteins (StARD1–StARD15), of which the C-terminal cytosolic START domain of metastatic lymph node 64 (MLN64 or StARD3) is known to mobilize cholesterol and is proposed to participate in steroidogenesis. Given its key role in steroidogenesis, it is of interest to identify new inhibitors that are able to bind MLN64 protein. In the present study, we used a ligand-based virtual screening approach to identify ligands from the ZINC database, with D(−)-Tartaric Acid (TAR) serving as a template.

18.
19.
20.