Similar articles
20 similar articles found (search time: 62 ms)
1.

Background

Complex diseases are associated with altered interactions between thousands of genes. We developed a novel method to identify and prioritize disease genes, which was generally applicable to complex diseases.

Results

We identified modules of highly interconnected genes in disease-specific networks derived by integrating gene-expression and protein-interaction data. We examined whether those modules were enriched for disease-associated SNPs and could be used to find novel genes for functional studies. First, we analyzed publicly available gene expression microarray and genome-wide association study (GWAS) data from 13 highly diverse complex diseases. In each disease, highly interconnected genes formed modules, which were significantly enriched for genes harboring disease-associated SNPs. To test whether such modules could be used to find novel genes for functional studies, we repeated the analyses using our own gene expression microarray and GWAS data from seasonal allergic rhinitis. We identified a novel gene, FGF2, whose relevance was supported by functional studies using combined small interfering RNA-mediated knock-down and gene expression microarrays. The modules in the 13 complex diseases analyzed here tended to overlap and were enriched for pathways related to oncological, metabolic and inflammatory diseases. This suggested that the union of the modules would be associated with a general increase in susceptibility to complex diseases. Indeed, we found that this union was enriched with GWAS genes for 145 other complex diseases.

Conclusions

Modules of highly interconnected complex disease genes were enriched for disease-associated SNPs, and could be used to find novel genes for functional studies.

2.
3.

Background

Genes encoding cytokine mediators are prime candidates for genetic analysis in conditions driven by a T-helper (Th) cell imbalance. Idiopathic pulmonary fibrosis (IPF) is a predominantly Th2-mediated disease associated with a paucity of interferon-gamma (IFN-γ), which may favor the development of progressive fibrosis. Interleukin-12 (IL-12) plays a key role in inducing IFN-γ production. The aim of the current study was to assess whether the 1188 (A/C) 3' UTR single nucleotide polymorphism (SNP) in the IL-12 p40 subunit gene, which was recently found to be functional, and the 5644 (G/A) 3' UTR SNP of the IFN-γ gene were associated with susceptibility to IPF.

Methods

We investigated the allelic distribution in these loci in UK white Caucasoid subjects comprising 73 patients with IPF and 157 healthy controls. The SNPs were determined using the polymerase chain reaction in association with sequence-specific primers incorporating mismatches at the 3'-end.

Results

Our results showed that these polymorphisms were distributed similarly in the IPF and control groups.

Conclusion

We conclude that these two potentially important candidate gene single nucleotide polymorphisms are not associated with susceptibility to IPF.

4.
5.

Background

The controversy surrounding the non-uniqueness of predictive gene lists (PGLs), small subsets of genes selected from the very large pool of candidates available in DNA microarray experiments, is now widely acknowledged [1]. Many of these studies have focused on constructing discriminative semi-parametric models and as such are also subject to the issue of random correlations in sparse model selection in high-dimensional spaces. In this work we outline a different approach based on an unsupervised, patient-specific nonlinear topographic projection of predictive gene lists.

Methods

We construct nonlinear topographic projection maps based on inter-patient gene-list relative dissimilarities. The Neuroscale, Stochastic Neighbor Embedding (SNE) and Locally Linear Embedding (LLE) techniques are used to construct two-dimensional projective visualisation plots of the 70-dimensional PGL of each patient. Classifiers are also constructed on the resulting projections to identify the prognosis indicator of each patient and to investigate whether the two prognosis groups are separable a posteriori on the evidence of the gene lists. A literature-proposed predictive gene list for breast cancer is benchmarked against a separate gene list using the above methods. Generalisation ability is investigated by using the mapping capability of Neuroscale to visualise the follow-up study based on the projections derived from the original dataset.

Results

The results indicate that small subsets of patient-specific PGLs have insufficient prognostic dissimilarity to permit a distinction between the two prognosis groups. Uncertainty and diversity across multiple gene expressions prevent unambiguous or even confident patient grouping. Comparative projections across different PGLs provide similar results.

Conclusion

The random correlation effect to an arbitrary outcome induced by small subset selection from very high dimensional interrelated gene expression profiles leads to an outcome with associated uncertainty. This continuum and uncertainty preclude any attempt at constructing discriminative classifiers. However, a patient's gene expression profile could possibly be used in treatment planning, based on knowledge of other patients' responses. We conclude that many of the patients involved in such medical studies are intrinsically unclassifiable on the basis of the provided PGL evidence. This additional category of 'unclassifiable' should be accommodated within medical decision support systems if serious errors and unnecessary adjuvant therapy are to be avoided.

6.
7.

Background

The EST database provides a rich resource for gene discovery and in silico expression analysis. We report a novel computational approach to identify co-expressed genes using the EST database, and its application to IL-8.

Results

IL-8 is represented in 53 dbEST cDNA libraries. We calculated the frequency of occurrence of all the genes represented in these cDNA libraries, and ranked the candidates based on a Z-score. Additional analysis suggests that most IL-8 related genes are differentially expressed between non-tumor and tumor tissues. To focus on IL-8's function in tumor tissues, we further analyzed and ranked the genes in 16 IL-8 related tumor libraries.
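The abstract does not give the Z-score formula, so the following is only a minimal sketch of one plausible reading: count how often each gene co-occurs with the bait gene across libraries, and compare that count with the expectation from the gene's overall library frequency using a normal approximation to the binomial. The function name `cooccurrence_z`, the bait label `"IL8"` and the toy libraries are all hypothetical.

```python
import math


def cooccurrence_z(libraries, bait):
    """Rank genes by a co-occurrence Z-score with `bait` (illustrative only).

    libraries: list of sets of gene names, one set per cDNA library.
    The binomial normal approximation used here is an assumption,
    not the scoring scheme stated in the abstract.
    """
    target = [lib for lib in libraries if bait in lib]
    n_target = len(target)
    n_total = len(libraries)
    candidates = set().union(*target) - {bait}

    scores = {}
    for gene in candidates:
        observed = sum(1 for lib in target if gene in lib)
        # Background frequency of the gene across all libraries.
        p = sum(1 for lib in libraries if gene in lib) / n_total
        sd = math.sqrt(n_target * p * (1 - p))
        if sd == 0:
            continue  # gene present in every library: no information
        scores[gene] = (observed - n_target * p) / sd

    # Highest Z-score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy data: "A" always co-occurs with the bait, "B" only half the time.
toy_libraries = [{"IL8", "A", "B"}, {"IL8", "A"}, {"B"}, {"C"}]
ranking = cooccurrence_z(toy_libraries, "IL8")
```

In the toy data, gene "A" appears only in bait-containing libraries and therefore ranks above "B", whose co-occurrence matches its background frequency.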

Conclusions

This method generated a reference database for genes co-expressed with IL-8 and could facilitate further characterization of functional association among genes.

8.
9.
A genome-wide association study of seed protein and oil content in soybean (cited 8 times: 0 self-citations, 8 by others)

Background

Association analysis is an alternative to conventional family-based methods to detect the location of gene(s) or quantitative trait loci (QTL) and provides relatively high resolution in terms of defining the genome position of a gene or QTL. Seed protein and oil concentration are quantitative traits which are determined by the interaction among many genes with small to moderate genetic effects and their interaction with the environment. In this study, a genome-wide association study (GWAS) was performed to identify quantitative trait loci (QTL) controlling seed protein and oil concentration in 298 soybean germplasm accessions exhibiting a wide range of seed protein and oil content.

Results

A total of 55,159 single nucleotide polymorphisms (SNPs) were genotyped using various methods including Illumina Infinium and GoldenGate assays and 31,954 markers with minor allele frequency >0.10 were used to estimate linkage disequilibrium (LD) in heterochromatic and euchromatic regions. In euchromatic regions, the mean LD (r²) rapidly declined to 0.2 within 360 Kbp, whereas the mean LD declined to 0.2 at 9,600 Kbp in heterochromatic regions. The GWAS results identified 40 SNPs in 17 different genomic regions significantly associated with seed protein. Of these, the five SNPs with the highest associations and seven adjacent SNPs were located in the 27.6-30.0 Mbp region of Gm20. A major seed protein QTL has been previously mapped to the same location and potential candidate genes have recently been identified in this region. The GWAS results also detected 25 SNPs in 13 different genomic regions associated with seed oil. Of these markers, seven SNPs had a significant association with both protein and oil.
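For readers unfamiliar with the LD measure, the textbook definition of r² for two biallelic loci can be sketched as follows. This assumes phased haplotypes coded 0/1, which the abstract does not state; the function names are hypothetical and the study's actual LD software is not known from the text.

```python
def ld_r2(p_ab, p_a, p_b):
    """r^2 = D^2 / (pA(1-pA) pB(1-pB)), with D = pAB - pA*pB."""
    denominator = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denominator == 0:
        raise ValueError("r^2 is undefined for a monomorphic locus")
    d = p_ab - p_a * p_b
    return d * d / denominator


def ld_r2_from_haplotypes(locus_a, locus_b):
    """Compute r^2 from two equal-length lists of 0/1 alleles (phased)."""
    if len(locus_a) != len(locus_b) or not locus_a:
        raise ValueError("need two non-empty lists of equal length")
    n = len(locus_a)
    p_a = sum(locus_a) / n
    p_b = sum(locus_b) / n
    # Frequency of haplotypes carrying allele 1 at both loci.
    p_ab = sum(1 for a, b in zip(locus_a, locus_b) if a == 1 and b == 1) / n
    return ld_r2(p_ab, p_a, p_b)


perfect = ld_r2_from_haplotypes([1, 1, 0, 0], [1, 1, 0, 0])      # complete LD
independent = ld_r2_from_haplotypes([1, 1, 0, 0], [1, 0, 1, 0])  # no LD
```

An LD decay curve like the one summarized above would then plot the mean of such pairwise r² values against the physical distance between markers.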

Conclusions

This research indicated that GWAS not only identified most of the previously reported QTL controlling seed protein and oil, but also resulted in narrower genomic regions than the regions reported as containing these QTL. The narrower GWAS-defined genome regions will allow more precise marker-assisted allele selection and will expedite positional cloning of the causal gene(s).

10.
11.
12.

Key message

Using GWAS approaches, we detected independent resistant markers in sugarcane towards a vectored virus disease. Based on comparative genomics, several candidate genes potentially involved in virus/aphid/plant interactions were pinpointed.

Abstract

Yellow leaf of sugarcane is an emerging viral disease whose causal agent is a Polerovirus, the Sugarcane yellow leaf virus (SCYLV), transmitted by aphids. To identify quantitative trait loci controlling resistance to yellow leaf which are of direct relevance for breeding, we undertook a genome-wide association study (GWAS) on a sugarcane cultivar panel (n = 189) representative of current breeding germplasm. This panel was fingerprinted with 3,949 polymorphic markers (DArT and AFLP). The panel was phenotyped for SCYLV infection in leaves and stalks in two trials for two crop cycles, under natural disease pressure prevalent in Guadeloupe. Mixed linear models including co-factors representing population structure fixed effects and pairwise kinship random effects provided an efficient control of the risk of inflated type-I error at a genome-wide level. Six independent markers were significantly associated with the SCYLV resistance phenotype. These markers individually explained between 9 and 14 % of the disease variation of the cultivar panel. Their frequency in the panel was relatively low (8–20 %). Among them, two markers were detected repeatedly across the GWAS exercises based on the different disease resistance parameters. These two markers could be aligned by BLAST to the Sorghum bicolor genome, and candidate genes potentially involved in plant–aphid or plant–virus interactions were localized in the vicinity of the sorghum homologs of the sugarcane markers. Our results illustrate the potential of GWAS approaches to prospect among sugarcane germplasm for accessions likely bearing resistance alleles of significant effect useful in breeding programs.

13.

Background

Genome-wide association studies (GWAS) have become a common approach to identifying single nucleotide polymorphisms (SNPs) associated with complex diseases. As complex diseases are caused by the joint effects of multiple genes, while the effect of an individual gene or SNP is modest, a method considering the joint effects of multiple SNPs can be more powerful than testing individual SNPs. The multi-SNP analysis aims to test association based on a SNP set, usually defined based on biological knowledge such as a gene or pathway, which may contain only a portion of SNPs with effects on the disease. Therefore, a challenge for the multi-SNP analysis is how to effectively select a subset of SNPs with promising association signals from the SNP set.

Results

We developed the Optimal P-value Threshold Pedigree Disequilibrium Test (OPTPDT). The OPTPDT uses general nuclear families. A variable p-value threshold algorithm is used to determine an optimal p-value threshold for selecting a subset of SNPs. A permutation procedure is used to assess the significance of the test. We used simulations to verify that the OPTPDT has correct type I error rates. Our power studies showed that the OPTPDT can be more powerful than the set-based test in PLINK, the multi-SNP FBAT test, and the p-value based test GATES. We applied the OPTPDT to a family-based autism GWAS dataset for gene-based association analysis and identified MACROD2-AS1 with genome-wide significance (p-value = 2.5 × 10⁻⁶).
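The general idea of a variable p-value threshold combined with a permutation procedure can be sketched as below. This is not the OPTPDT itself, whose statistic is built on the pedigree disequilibrium test in nuclear families; it is a generic min-p sketch that assumes per-SNP statistics and p-values have already been computed for the observed data and for each permuted dataset. The function name and the toy numbers are hypothetical.

```python
def variable_threshold_test(observed, permuted, thresholds):
    """Return (best_threshold, adjusted_p) for one SNP set.

    observed:   list of (statistic, p_value) pairs, one per SNP.
    permuted:   list of such lists, one per permuted dataset.
    thresholds: candidate p-value cut-offs for including a SNP.
    """
    def set_statistics(snps):
        # One set-level statistic per threshold: sum over SNPs passing it.
        return [sum(stat for stat, p in snps if p <= t) for t in thresholds]

    perm_rows = [set_statistics(snps) for snps in permuted]
    n_perm = len(perm_rows)

    def empirical_ps(row):
        # Fraction of permuted datasets with a statistic at least as large.
        return [sum(1 for r in perm_rows if r[j] >= value) / n_perm
                for j, value in enumerate(row)]

    obs_ps = empirical_ps(set_statistics(observed))
    best_index = min(range(len(thresholds)), key=lambda j: obs_ps[j])
    obs_min = obs_ps[best_index]

    # Correct for having searched over thresholds: compare the observed
    # minimum p-value with the minimum obtained in each permuted dataset.
    hits = sum(1 for row in perm_rows if min(empirical_ps(row)) <= obs_min)
    return thresholds[best_index], (1 + hits) / (1 + n_perm)


toy_observed = [(10.0, 0.001), (0.1, 0.8)]
toy_permuted = [
    [(1.0, 0.3), (0.5, 0.5)],
    [(0.2, 0.7), (2.0, 0.15)],
    [(0.3, 0.6), (0.4, 0.55)],
]
result = variable_threshold_test(toy_observed, toy_permuted, [0.05, 0.5, 1.0])
```

With only three permutations the smallest attainable adjusted p-value is 1/4; a real analysis would use far more permutations to resolve small p-values.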

Conclusions

Our simulation results suggested that the OPTPDT is a valid and powerful test. The OPTPDT will be helpful for gene-based or pathway association analysis. The method is ideal for the secondary analysis of existing GWAS datasets, which may identify a set of SNPs with joint effects on the disease.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1620-3) contains supplementary material, which is available to authorized users.

14.
15.

Key message

Using association and linkage mapping, two SNP markers closely linked to the SBWMV resistance gene on chromosome 5D were identified and can be used to select the gene in breeding.

Abstract

Soil-borne wheat mosaic virus (SBWMV) disease is a serious viral disease in winter wheat growing areas worldwide. SBWMV infection can reduce grain yield by up to 80 %. Developing resistant wheat cultivars is the only feasible strategy to reduce the losses. In this study, wheat Infinium iSelect Beadchips with 9 K wheat SNPs were used to genotype an association mapping population of 205 wheat accessions. Six new SNPs from two genes were found to be significantly associated with the gene for SBWMV resistance on chromosome 5D. The SNPs and Xgwm469, an SSR marker that has been reported to be associated with the gene, were mapped close to the gene using F6-derived recombinant inbred lines from the cross between a resistant parent ‘Heyne’ and a susceptible parent ‘Trego’. Two representative SNPs, wsnp_CAP11_c209_198467 and wsnp_JD_c4438_5568170, from the two linked genes in wheat were converted into KBioscience Competitive Allele-Specific PCR assays and can be easily used in marker-assisted selection to improve wheat resistance to SBWMV in breeding.

16.

Background

Evaluating copy numbers of given genes in Plasmodium falciparum parasites is of major importance for laboratory-based studies or epidemiological surveys. For instance, pfmdr1 gene amplification has been associated with resistance to quinine derivatives and several genes involved in anti-oxidant defence may play an important role in resistance to antimalarial drugs, although their potential involvement has been overlooked.

Methods

The ΔΔCt method of relative quantification using real-time quantitative PCR with SYBR Green I detection was adapted and optimized to estimate copy numbers of three genes previously indicated as putative candidates of resistance to quinolines and artemisinin derivatives: pfmdr1, pfatp6 (SERCA) and pftctp, and in six further genes involved in oxidative stress responses.
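The arithmetic behind the standard ΔΔCt calculation can be sketched as follows. This is the generic 2^(−ΔΔCt) formula, assuming near-100 % amplification efficiency for both the target and the reference gene and a calibrator sample of known copy number; the abstract does not describe the authors' optimized protocol, and every name and Ct value here is hypothetical.

```python
def copy_number_ddct(ct_target_sample, ct_reference_sample,
                     ct_target_calibrator, ct_reference_calibrator,
                     calibrator_copies=1.0, efficiency=2.0):
    """Estimate gene copy number with the generic delta-delta-Ct formula.

    efficiency=2.0 assumes the product doubles each cycle (an assumption);
    calibrator_copies is the known copy number in the calibrator sample.
    """
    # Normalize the target gene to the single-copy reference gene.
    delta_sample = ct_target_sample - ct_reference_sample
    delta_calibrator = ct_target_calibrator - ct_reference_calibrator
    # Compare the test sample with the calibrator.
    ddct = delta_sample - delta_calibrator
    return calibrator_copies * efficiency ** (-ddct)


# A target that crosses threshold one cycle earlier than in the calibrator
# (relative to the reference gene) corresponds to roughly twice the copies.
estimate = copy_number_ddct(19.0, 20.0, 20.0, 20.0)
```

In practice the estimate is usually averaged over replicate wells and rounded to the nearest integer copy number.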

Results

Using carefully designed specific RT-qPCR oligonucleotides, the methods were optimized for each gene and validated by accurately measuring the previously known number of copies of the pfmdr1 gene in the laboratory reference strains P. falciparum 3D7 and Dd2. Subsequently, Standard Operating Procedures (SOPs) were developed for the remaining genes under study and successfully applied to DNA obtained from dried filter blood spots of field isolates of P. falciparum collected in São Tomé & Principe, West Africa.

Conclusion

The SOPs reported here may be used as a high throughput tool to investigate the role of these drug resistance gene candidates in laboratory studies or large scale epidemiological surveys.

17.
18.

Background

The interaction between loci to affect phenotype is called epistasis. It is strict epistasis if no proper subset of the interacting loci exhibits a marginal effect. For many diseases, it is likely that unknown epistatic interactions affect disease susceptibility. A difficulty when mining epistatic interactions from high-dimensional datasets concerns the curse of dimensionality. There are too many combinations of SNPs to perform an exhaustive search. A method that could locate strict epistasis without an exhaustive search can be considered the brass ring of methods for analyzing high-dimensional datasets.

Methodology/Findings

A SNP pattern is a Bayesian network representing SNP-disease relationships. The Bayesian score for a SNP pattern is the probability of the data given the pattern, and has been used to learn SNP patterns. We identified a bound for the score of a SNP pattern. The bound provides an upper limit on the Bayesian score of any pattern that could be obtained by expanding a given pattern. We felt that the bound might enable the data to say something about the promise of expanding a 1-SNP pattern even when there are no marginal effects. We tested the bound using simulated datasets and semi-synthetic high-dimensional datasets obtained from GWAS datasets. We found that the bound was able to dramatically reduce the search time for strict epistasis. Using an Alzheimer's dataset, we showed that it is possible to discover an interaction involving the APOE gene based on its score because of its large marginal effect, but that the bound is most effective at discovering interactions without marginal effects.

Conclusions/Significance

We conclude that the bound appears to ameliorate the curse of dimensionality in high-dimensional datasets. This is a very consequential result and could be pivotal in our efforts to reveal the dark matter of genetic disease risk from high-dimensional datasets.

19.

Key message

Thirty significant associations between 22 SNPs and five plant architecture component traits in Chinese upland cotton were identified via GWAS. Four peak SNP loci located on chromosome D03 were simultaneously associated with multiple plant architecture component traits. A candidate gene, Gh_D03G0922, might be responsible for plant height in upland cotton.

Abstract

A compact plant architecture is increasingly required for mechanized harvesting processes in China. Therefore, cotton plant architecture is an important trait, and its components, such as plant height, fruit branch length and fruit branch angle, affect the suitability of a cultivar for mechanized harvesting. To determine the genetic basis of cotton plant architecture, a genome-wide association study (GWAS) was performed using a panel composed of 355 accessions and 93,250 single nucleotide polymorphisms (SNPs) identified using the specific-locus amplified fragment sequencing method. Thirty significant associations between 22 SNPs and five plant architecture component traits were identified via GWAS. Most importantly, four peak SNP loci located on chromosome D03 were simultaneously associated with multiple plant architecture component traits, and these SNPs were harbored in one linkage disequilibrium block. Furthermore, 21 candidate genes for plant architecture were predicted in a 0.95-Mb region including the four peak SNPs. One of these genes (Gh_D03G0922) was near the significant SNP D03_31584163 (8.40 kb), and its Arabidopsis homologs contain MADS-box domains that might be involved in plant growth and development. qRT-PCR showed that the expression of Gh_D03G0922 was upregulated in the apical buds and young leaves of the short and compact cotton varieties, and virus-induced gene silencing (VIGS) showed that the silenced plants exhibited increased plant height (PH). These results indicate that Gh_D03G0922 is likely the candidate gene for PH in cotton. The genetic variations and candidate genes identified in this study lay a foundation for cultivating moderately short and compact varieties in future Chinese cotton-breeding programs.

20.
A BAC-based integrated linkage map of the silkworm Bombyx mori (cited 3 times: 0 self-citations, 3 by others)

Background

In 2004, draft sequences of the model lepidopteran Bombyx mori were reported using whole-genome shotgun sequencing. Because of relatively shallow genome coverage, the silkworm genome remains fragmented, hampering annotation and comparative genome studies. For a more complete genome analysis, we developed extended scaffolds combining physical maps with improved genetic maps.

Results

We mapped 1,755 single nucleotide polymorphism (SNP) markers from bacterial artificial chromosome (BAC) end sequences onto 28 linkage groups using a recombining male backcross population, yielding an average inter-SNP distance of 0.81 cM (about 270 kilobases). We constructed 6,221 contigs by fingerprinting clones from three BAC libraries digested with different restriction enzymes, and assigned a total of 724 single copy genes to them by BLAST (basic local alignment search tool) search of the BAC end sequences and high-density BAC filter hybridization using expressed sequence tags as probes. We assigned 964 additional expressed sequence tags to linkage groups by restriction fragment length polymorphism analysis of a nonrecombining female backcross population. Altogether, 361.1 megabases of BAC contigs and singletons were integrated with a map containing 1,688 independent genes. A test of synteny using Oxford grid analysis with more than 500 silkworm genes revealed six versus 20 silkworm linkage groups containing eight or more orthologs of Apis versus Tribolium, respectively.

Conclusion

The integrated map contains approximately 10% of predicted silkworm genes and has an estimated 76% genome coverage by BACs. This provides a new resource for improved assembly of whole-genome shotgun data, gene annotation and positional cloning, and will serve as a platform for comparative genomics and gene discovery in Lepidoptera and other insects.
