Similar articles
Found 20 similar articles (search time: 15 ms)
1.
Zhao J  Yang TH  Huang Y  Holme P 《PloS one》2011,6(9):e24306
Many diseases have complex genetic causes, where a set of alleles can affect the propensity of getting the disease. The identification of such disease genes is important to understand the mechanistic and evolutionary aspects of pathogenesis, improve diagnosis and treatment of the disease, and aid in drug discovery. Current genetic studies typically identify chromosomal regions associated with specific diseases. But picking out an unknown disease gene from hundreds of candidates located on the same genomic interval is still challenging. In this study, we propose an approach to prioritize candidate genes by integrating data of gene expression level, protein-protein interaction strength and known disease genes. Our method is based on only two simple, biologically motivated assumptions: that a gene is a good disease-gene candidate if it is differentially expressed in cases and controls, or that it is close to other disease-gene candidates in its protein interaction network. We tested our method on 40 diseases in 58 gene expression datasets of the NCBI Gene Expression Omnibus database. On these datasets our method is able to predict unknown disease genes as well as identify pleiotropic genes involved in the physiological cellular processes of many diseases. Our study not only provides an effective algorithm for prioritizing candidate disease genes but is also a way to discover phenotypic interdependency, co-occurrence and shared pathophysiology between different disorders.
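The two assumptions in this abstract (differential expression, and proximity to other candidates in the interaction network) can be illustrated with a toy scoring function. This is only a minimal sketch and not the authors' algorithm: the function name `prioritize`, the `weight` parameter, the additive scoring rule, and all gene names and scores are hypothetical.

```python
from collections import defaultdict

def prioritize(candidates, diff_expr, interactions, weight=0.5):
    """Rank candidate genes by their own differential-expression score
    plus a weighted mean of their interaction partners' scores.
    The additive rule and `weight` are illustrative assumptions."""
    neighbours = defaultdict(set)
    for a, b in interactions:
        neighbours[a].add(b)
        neighbours[b].add(a)

    scores = {}
    for gene in candidates:
        own = diff_expr.get(gene, 0.0)
        partners = neighbours[gene]
        if partners:
            neighbour_mean = sum(diff_expr.get(p, 0.0) for p in partners) / len(partners)
        else:
            neighbour_mean = 0.0
        scores[gene] = own + weight * neighbour_mean
    return sorted(candidates, key=lambda g: scores[g], reverse=True)

# Hypothetical data: D1 stands for a known disease gene outside the candidate interval.
diff_expr = {"G1": 0.2, "G2": 1.5, "G3": 0.1, "D1": 2.0}
interactions = [("G1", "D1"), ("G3", "G1")]
ranking = prioritize(["G1", "G2", "G3"], diff_expr, interactions)
print(ranking)
```

In this toy data, G1 is weakly differentially expressed itself but gains score from its interaction with D1, which is the intuition behind the second assumption.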

2.
In comparison to other complex disease traits, alcoholism and alcohol abuse are influenced by the combined effects of many genes that alter susceptibility, phenotypic expression and associated morbidity. Many genetic studies, in both animal models and humans, have identified genetic intervals containing genes that influence alcoholism or behavioral responses to ethanol. Concurrently, a growing number of microarray studies have identified gene expression differences related to ethanol drinking or other ethanol behaviors. However, concerns about the statistical power of these experiments, combined with the complexity of the underlying phenotypes, have greatly hampered the identification of candidate genes underlying ethanol behaviors. Meta-analysis approaches using recent compilations of large datasets of microarray, behavioral and genetic data promise improved statistical power for detecting the genes or gene networks affecting ethanol behaviors and other complex traits.

3.
Genome-wide techniques such as microarray analysis, Serial Analysis of Gene Expression (SAGE), Massively Parallel Signature Sequencing (MPSS), linkage analysis and association studies are used extensively in the search for genes that cause diseases, and often identify many hundreds of candidate disease genes. Selection of the most probable of these candidate disease genes for further empirical analysis is a significant challenge. Additionally, identifying the genes that cause complex diseases is problematic due to low penetrance of multiple contributing genes. Here, we describe a novel bioinformatic approach that selects candidate disease genes according to their expression profiles. We use the eVOC anatomical ontology to integrate text-mining of biomedical literature and data-mining of available human gene expression data. To demonstrate that our method is successful and widely applicable, we apply it to a database of 417 candidate genes containing 17 known disease genes. We successfully select the known disease gene for 15 out of 17 diseases and reduce the candidate gene set to 63.3% (±18.8%) of its original size. This approach facilitates direct association between genomic data describing gene expression and information from biomedical texts describing disease phenotype, and successfully prioritizes candidate genes according to their expression in disease-affected tissues.

4.
Intracranial aneurysm (IA) is a complex genetic disease for which, to date, 10 loci have been identified by linkage. Identification of the risk-conferring genes in the loci has proven difficult, since the regions often contain several hundreds of genes. An approach to prioritize positional candidate genes for further studies is to use gene expression data from diseased and nondiseased tissue. Genes that are not expressed, either in diseased or nondiseased tissue, are ranked as unlikely to contribute to the disease. We demonstrate an approach for integrating expression and genetic mapping data to identify likely pathways involved in the pathogenesis of a disease. We used expression profiles for IAs and nonaneurysmal intracranial arteries (IVs) together with the 10 reported linkage intervals for IA. Expressed genes were analyzed for membership in Kyoto Encyclopedia of Genes and Genomes (KEGG) biological pathways. The 10 IA loci harbor 1,858 candidate genes, of which 1,561 (84%) were represented on the microarrays. We identified 810 positional candidate genes for IA that were expressed in IVs or IAs. Pathway information was available for 294 of these genes and involved 32 KEGG biological function pathways represented on at least 2 loci. A likelihood-based score was calculated to rank pathways for involvement in the pathogenesis of IA. Adherens junction, MAPK, and Notch signaling pathways ranked high. Integration of gene expression profiles with genetic mapping data for IA provides an approach to identify candidate genes that are more likely to function in the pathology of IA.

5.

Background

In the post-genome era, a major goal of biology is the identification of specific roles for individual genes. We report a new genomic tool for gene characterization, the UCLA Gene Expression Tool (UGET).

Results

Celsius, the largest co-normalized microarray dataset of Affymetrix-based gene expression, was used to calculate the correlation between all possible gene pairs on all platforms, and generate stored indexes in a web-searchable format. The size of Celsius makes UGET a powerful gene characterization tool. Using a small seed list of known cartilage-selective genes, UGET extended the list of known genes by identifying 32 new highly cartilage-selective genes. Of these, 7 of 10 tested were validated by qPCR, including the novel cartilage-specific genes SDK2 and FLJ41170. In addition, we retrospectively tested UGET and other gene-expression-based prioritization tools to identify disease-causing genes within known linkage intervals. We first demonstrated this utility with UGET using genetically heterogeneous disorders such as Joubert syndrome, microcephaly, neuropsychiatric disorders and type 2 limb girdle muscular dystrophy (LGMD2) and then compared UGET to other gene-expression-based prioritization programs which use small but discrete and well-annotated datasets. Finally, we observed a significantly higher gene correlation shared between genes in disease networks associated with similar complex or Mendelian disorders.

Discussion

UGET is an invaluable resource for a geneticist that permits the rapid inclusion of expression criteria from one to hundreds of genes in genomic intervals linked to disease. By using thousands of arrays, UGET annotates and prioritizes genes better than other tools, especially with rare tissue disorders or complex multi-tissue biological processes. This information can be critical in prioritization of candidate genes for sequence analysis.
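The seed-list idea described in this entry (ranking genes by how strongly their expression correlates with a small set of known genes) can be sketched with a plain Pearson correlation. This is an illustrative sketch only, not UGET's implementation: the function names, the mean-correlation ranking rule, and the expression values are all assumptions.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def rank_by_coexpression(expression, seeds):
    """Rank non-seed genes by their mean correlation with the seed genes."""
    ranked = []
    for gene, profile in expression.items():
        if gene in seeds:
            continue
        mean_r = sum(pearson(profile, expression[s]) for s in seeds) / len(seeds)
        ranked.append((gene, mean_r))
    return sorted(ranked, key=lambda item: item[1], reverse=True)

# Hypothetical expression profiles across four arrays.
expression = {
    "SEED_A": [1.0, 2.0, 3.0, 4.0],
    "SEED_B": [2.0, 4.0, 6.0, 8.5],
    "GENE_X": [1.1, 2.1, 2.9, 4.2],
    "GENE_Y": [4.0, 3.0, 2.0, 1.0],
}
ranked = rank_by_coexpression(expression, {"SEED_A", "SEED_B"})
for gene, r in ranked:
    print(gene, round(r, 3))
```

A real tool working over thousands of arrays would precompute and index these correlations rather than recompute them per query, as the abstract's mention of stored indexes suggests.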

6.

Background

Even in the post-genomic era, the identification of candidate genes within loci associated with human genetic diseases is a very demanding task, because the critical region may typically contain hundreds of positional candidates. Since genes implicated in similar phenotypes tend to share very similar expression profiles, high throughput gene expression data may represent a very important resource to identify the best candidates for sequencing. However, so far, gene coexpression has not been used very successfully to prioritize positional candidates.

Methodology/Principal Findings

We show that it is possible to reliably identify disease-relevant relationships among genes from massive microarray datasets by concentrating only on genes sharing similar expression profiles in both human and mouse. Moreover, we show systematically that the integration of human-mouse conserved coexpression with a phenotype similarity map allows the efficient identification of disease genes in large genomic regions. Finally, using this approach on 850 OMIM loci characterized by an unknown molecular basis, we propose high-probability candidates for 81 genetic diseases.

Conclusion

Our results demonstrate that conserved coexpression, even at the human-mouse phylogenetic distance, represents a very strong criterion to predict disease-relevant relationships among human genes.

7.
Most common genetic disorders have a complex inheritance and may result from variants in many genes, each contributing only weak effects to the disease. Pinpointing these disease genes within the myriad of susceptibility loci identified in linkage studies is difficult because these loci may contain hundreds of genes. However, in any disorder, most of the disease genes will be involved in only a few different molecular pathways. If we know something about the relationships between the genes, we can assess whether some genes (which may reside in different loci) functionally interact with each other, indicating a joint basis for the disease etiology. There are various repositories of information on pathway relationships. To consolidate this information, we developed a functional human gene network that integrates information on genes and the functional relationships between genes, based on data from the Kyoto Encyclopedia of Genes and Genomes, the Biomolecular Interaction Network Database, Reactome, the Human Protein Reference Database, the Gene Ontology database, predicted protein-protein interactions, human yeast two-hybrid interactions, and microarray co-expressions. We applied this network to interrelate positional candidate genes from different disease loci and then tested 96 heritable disorders for which the Online Mendelian Inheritance in Man database reported at least three disease genes. Artificial susceptibility loci, each containing 100 genes, were constructed around each disease gene, and we used the network to rank these genes on the basis of their functional interactions. By following up the top five genes per artificial locus, we were able to detect at least one known disease gene in 54% of the loci studied, representing a 2.8-fold increase over random selection. This suggests that our method can significantly reduce the cost and effort of pinpointing true disease genes in analyses of disorders for which numerous loci have been reported but for which most of the genes are unknown.

8.
Xu Y  Duanmu H  Chang Z  Zhang S  Li Z  Li Z  Liu Y  Li K  Qiu F  Li X 《Molecular biology reports》2012,39(2):1627-1637
Copy number variations (CNVs) are one type of human genetic variation and are pervasive in the human genome. It has been confirmed that they can play a causal role in complex diseases. Previous studies of CNVs focused more on identifying the disease-specific CNV regions or candidate genes on these CNV regions, but less on the synergistic actions between genes on CNV regions and other genes. Our research combined CNVs with related gene co-expression to reconstruct a gene co-expression network using single nucleotide polymorphism microarray datasets and gene microarray datasets of breast cancer, and then extracted densely connected modules and analyzed their functions. Interestingly, all of these modules' functions were related to breast cancer according to our enrichment analysis, and most of the genes in these modules have been reported to be involved in breast cancer. Our findings suggest that integrating CNVs and gene co-expression relations is a feasible way to analyze the roles of CNV genes and their synergistic genes in breast cancer, and provide novel insight into the pathological mechanism of breast cancer.

9.
Chronic obstructive pulmonary disease (COPD) is a complex human disease likely influenced by multiple genes, cigarette smoking, and gene-by-smoking interactions, but only severe alpha 1-antitrypsin deficiency is a proven genetic risk factor for COPD. Prior linkage analyses in the Boston Early-Onset COPD Study have demonstrated significant linkage to a key intermediate phenotype of COPD on chromosome 2q. We integrated results from murine lung development and human COPD gene-expression microarray studies with human COPD linkage results on chromosome 2q to prioritize candidate-gene selection, thus identifying SERPINE2 as a positional candidate susceptibility gene for COPD. Immunohistochemistry demonstrated expression of serpine2 protein in mouse and human adult lung tissue. In family-based association testing of 127 severe, early-onset COPD pedigrees from the Boston Early-Onset COPD Study, we observed significant association with COPD phenotypes and 18 single-nucleotide polymorphisms (SNPs) in the SERPINE2 gene. Association of five of these SNPs with COPD was replicated in a case-control analysis, with cases from the National Emphysema Treatment Trial and controls from the Normative Aging Study. Family-based and case-control haplotype analyses supported similar regions of association within the SERPINE2 gene. When significantly associated SNPs in these haplotypic regions were included as covariates in linkage models, LOD score attenuation was observed most markedly in a smokers-only linkage model (LOD 4.41, attenuated to 1.74). After the integration of murine and human microarray data to inform candidate-gene selection, we observed significant family-based association and independent replication of association in a case-control study, suggesting that SERPINE2 is a COPD-susceptibility gene and is likely influenced by gene-by-smoking interaction.

10.
Huang HL  Lee CC  Ho SY 《Bio Systems》2007,90(1):78-86
It is essential to select a minimal number of relevant genes from microarray data while maximizing classification accuracy for the development of inexpensive diagnostic tests. However, it is intractable to simultaneously optimize gene selection and classification accuracy, which is a large parameter optimization problem. We propose an efficient evolutionary approach to gene selection from microarray data which can be combined with the optimal design of various multiclass classifiers. The proposed method (named GeneSelect) consists of three cooperating parts: an efficient encoding scheme of candidate solutions, a generalized fitness function, and an intelligent genetic algorithm (IGA). An existing hybrid approach based on a genetic algorithm and maximum likelihood classification (GA/MLHD) was previously proposed to select a small number of relevant genes for accurate classification of samples. To evaluate the performance of GeneSelect, the gene selection is combined with the same maximum likelihood classification (named IGA/MLHD) for convenient comparisons. IGA/MLHD is applied to 11 cancer-related human gene expression datasets. The simulation results show that IGA/MLHD is superior to GA/MLHD in terms of the number of selected genes, classification accuracy, and robustness of selected genes and accuracy.

11.
Linkage studies of complex traits frequently yield multiple linkage regions covering hundreds of genes. Testing each candidate gene from every region is prohibitively expensive and computational methods that simplify this process would benefit genetic research. We present a new method based on commonality of functional annotation (CFA) that aids dissection of complex traits for which multiple causal genes act in a single pathway or process. CFA works by testing individual Gene Ontology (GO) terms for enrichment among candidate gene pools, performs multiple hypothesis testing adjustment using an estimate of independent tests based on correlation of GO terms, and then scores and ranks genes annotated with significantly enriched terms based on the number of quantitative trait loci regions in which genes bearing those annotations appear. We evaluate CFA using simulated linkage data and show that CFA has good power despite being conservative. We apply CFA to published linkage studies investigating age-of-onset of Alzheimer's disease and body mass index and obtain previously known and new candidate genes. CFA provides a new tool for studies in which causal genes are expected to participate in a common pathway or process and can easily be extended to utilize annotation schemes in addition to the GO.
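The first step this abstract describes, testing a single GO term for enrichment among a pool of candidate genes, is commonly done with a one-sided hypergeometric test. The sketch below shows only that step and assumes the hypergeometric test is a reasonable stand-in; the abstract does not state which test CFA uses, and the multiple-testing adjustment and region-based scoring are not reproduced. The function name and the numbers are hypothetical.

```python
from math import comb

def hypergeom_pvalue(population, annotated, sample, hits):
    """One-sided enrichment p-value P(X >= hits), where X is the number of
    annotated genes among `sample` genes drawn without replacement from a
    `population` that contains `annotated` genes carrying the GO term."""
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total

# Hypothetical numbers: 20,000 genes in the genome, 200 carry the term,
# and 8 of the 300 candidate genes in the linkage regions carry it.
p_value = hypergeom_pvalue(population=20000, annotated=200, sample=300, hits=8)
print(f"{p_value:.4g}")
```

With an expected count of 3 annotated genes in a random pool of 300, observing 8 gives a small p-value, which would then still need correcting for the number of GO terms tested.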

12.

Background

Genome-wide association studies (GWASs) and global profiling of gene expression (microarrays) are two major technological breakthroughs that allow hypothesis-free identification of candidate genes associated with tumorigenesis. It is not obvious whether there is a consistency between the candidate genes identified by GWAS (GWAS genes) and those identified by profiling gene expression (microarray genes).

Methodology/Principal Findings

We used the Cancer Genetic Markers Susceptibility database to retrieve single nucleotide polymorphisms from candidate genes for prostate cancer. In addition, we conducted a large meta-analysis of gene expression data in normal prostate and prostate tumor tissue. We identified 13,905 genes that were interrogated by both GWASs and microarrays. On the basis of P values from GWASs, we selected the 1,649 most significantly associated genes for functional annotation by the Database for Annotation, Visualization and Integrated Discovery. We also conducted functional annotation analysis using the same number of top genes identified in the meta-analysis of the gene expression data. We found that genes involved in cell adhesion were overrepresented among both the GWAS and microarray genes.

Conclusions/Significance

We conclude that combining GWAS and microarray data would be a more effective approach than analyzing individual datasets and can help to refine the identification of candidate genes and functions associated with tumor development.

13.
14.
Systemic lupus erythematosus is the prototype multisystem autoimmune disease. A strong genetic component of susceptibility to the disease is well established. Studies of murine models of systemic lupus erythematosus have shown complex genetic interactions that influence both susceptibility and phenotypic expression. These models strongly suggest that several defects in similar pathways, e.g. clearance of immune complexes and/or apoptotic cell debris, can all result in disease expression. Studies in humans have found linkage to several overlapping regions on chromosome 1q, although the precise susceptibility gene or genes in these regions have yet to be identified. Recent studies of candidate genes, including Fcγ receptors, IL-6, and tumour necrosis factor-α, suggest that in human disease, genetic factors do play a role in disease susceptibility and clinical phenotype. The precise gene or genes involved and the strength of their influence do, however, appear to differ considerably in different populations.

15.
Rheumatoid arthritis is a chronic inflammatory disease with a high prevalence and substantial socioeconomic burden. Despite intense research efforts, its aetiology and pathogenesis remain poorly understood. To identify novel genes and/or cellular pathways involved in the pathogenesis of the disease, we utilized a well-recognized tumour necrosis factor-driven animal model of this disease and performed high-throughput expression profiling with subtractive cDNA libraries and oligonucleotide microarray hybridizations, coupled with independent statistical analysis. This twin approach was validated by a number of different methods in other animal models of arthritis as well as in human patient samples, thus creating a unique list of disease modifiers of potential therapeutic value. Importantly, and through the integration of genetic linkage analysis and Gene Ontology–assisted functional discovery, we identified the gelsolin-driven synovial fibroblast cytoskeletal rearrangements as a novel pathophysiological determinant of the disease.

16.
Most familial behavioral phenotypes result from the complex interaction of multiple genes. Studies of such phenotypes involving human subjects are often inconclusive owing to complexity of causation and experimental limitations. Studies of animal models argue for the use of established genetic strains as a powerful tool for genetic dissection of behavioral disorders and have led to the identification of rare genes and genetic mechanisms implicated in such phenotypes. We have used microarrays to study global gene expression in adult brains of four genetic strains of mice (C57BL/6J, DBA/2J, A/J, and BALB/c). Our results demonstrate that different strains show expression differences for a number of genes in the brain, and that closely related strains have similar patterns of gene expression as compared with distantly related strains. In addition, among the 24 000 genes and ESTs on the microarray, 77 showed at least a 1.5-fold increase in the brains of C57BL/6J mice as compared with those of DBA/2J mice. These genes fall into such functional categories as gene regulation, metabolism, cell signaling, neurotransmitter transport, and DNA/RNA binding. The importance of these findings as a novel genetic resource and their use and application in the genetic analysis of complex behavioral phenotypes, susceptibilities, and responses to drugs and chemicals are discussed.

17.
Pulmonary fibrosis is a progressive disorder whose molecular pathology is poorly understood. Here we developed an in-house cDNA microarray ("lung chip") originating from a lung-normalized cDNA library. By using this lung chip, we analyzed global gene expression in a murine model of bleomycin-induced fibrosis and selected 82 genes that differed by more than twofold intensity in at least one pairwise comparison with controls. Cluster analysis of these selected genes showed that the expression of genes associated with inflammation reached maximum levels at 5 days after bleomycin administration, while genes involved in the development of fibrosis increased gradually up to 14 days after bleomycin treatment. These changes in gene expression signature were well correlated with observed histopathological changes. The results show that microarray analysis of animal disease models is a powerful approach to understanding the gene expression programs that underlie these disorders.
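The selection rule mentioned in this abstract (a more than twofold difference in at least one pairwise comparison with controls) amounts to a simple fold-change filter. The following is a minimal sketch under stated assumptions: intensities are positive and already normalized, each gene has one control value, and both increases and decreases count. The function name and the values are hypothetical.

```python
def select_by_fold_change(treated, control, threshold=2.0):
    """Return genes whose treated/control intensity ratio exceeds `threshold`
    (up) or falls below 1/threshold (down) at one or more time points.
    Assumes strictly positive, already-normalized intensities."""
    selected = []
    for gene, timepoints in treated.items():
        baseline = control[gene]
        for value in timepoints:
            ratio = value / baseline
            if ratio > threshold or ratio < 1.0 / threshold:
                selected.append(gene)
                break  # one qualifying comparison is enough
    return selected

# Hypothetical intensities at two time points after treatment.
control = {"A": 100.0, "B": 100.0, "C": 100.0}
treated = {"A": [110.0, 250.0], "B": [90.0, 120.0], "C": [40.0, 95.0]}
selected = select_by_fold_change(treated, control)
print(selected)
```

Gene A is selected for a 2.5-fold increase at the second time point and gene C for a 2.5-fold decrease at the first, while gene B never crosses the threshold.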

18.
Expression QTL mapping by integrating genome-wide gene expression and genotype data is a promising approach to identifying functional genetic variation, but is hampered by the large number of multiple comparisons inherent in such studies. A novel approach to addressing multiple testing problems in genome-wide family-based association studies is screening candidate markers using heritability or conditional power. We apply these methods in settings in which microarray gene expression data are used as phenotypes, screening for SNPs near the expressed genes. We perform association analyses for phenotypes using a univariate approach. We also perform simulations on trios with large numbers of causal SNPs to determine the optimal number of markers to use in a screen. We demonstrate that our family-based screening approach performs well in the analysis of integrative genomic datasets and that screening using either heritability or conditional power produces similar, though not identical, results.

19.
20.
This report constitutes the seventh update of the human obesity gene map incorporating published results up to the end of October 2000. Evidence from the rodent and human obesity cases caused by single-gene mutations, Mendelian disorders exhibiting obesity as a clinical feature, quantitative trait loci uncovered in human genome-wide scans and in cross-breeding experiments in various animal models, and association and linkage studies with candidate genes and other markers are reviewed. Forty-seven human cases of obesity caused by single-gene mutations in six different genes have been reported in the literature to date. Twenty-four Mendelian disorders exhibiting obesity as one of their clinical manifestations have now been mapped. The number of different quantitative trait loci reported from animal models currently reaches 115. Attempts to relate DNA sequence variation in specific genes to obesity phenotypes continue to grow, with 130 studies reporting positive associations with 48 candidate genes. Finally, 59 loci have been linked to obesity indicators in genomic scans and other linkage study designs. The obesity gene map reveals that putative loci affecting obesity-related phenotypes can be found on all chromosomes except chromosome Y. A total of 54 new loci have been added to the map in the past 12 months and the number of genes, markers, and chromosomal regions that have been associated or linked with human obesity phenotypes is now above 250. Likewise, the number of negative studies, which are only partially reviewed here, is also on the rise.
