Similar articles
Found 20 similar articles; search time: 31 ms
1.
MOTIVATION: Arabidopsis thaliana has the best understood plant genome, yet approximately one-third of its genes still have no functional annotation at all from either MIPS or TAIR. We have applied our Data Mining Prediction (DMP) method to the problem of predicting the functional classes of these protein sequences. The method uses a hybrid machine-learning/data-mining approach to identify patterns in the bioinformatic data about sequences that are predictive of function. We use data about sequence, predicted secondary structure, predicted structural domain, InterPro patterns, sequence similarity profile and expression data. RESULTS: We predicted the functional class of a high percentage of the Arabidopsis genes with currently unknown function. These predictions are interpretable and have good test accuracies. We describe in detail seven of the rules produced.

2.
3.

Background  

Biological processes are carried out by coordinated modules of interacting molecules. As clustering methods demonstrate that genes with similar expression display increased likelihood of being associated with a common functional module, networks of coexpressed genes provide one framework for assigning gene function. This has informed the guilt-by-association (GBA) heuristic, widely invoked in functional genomics. Yet although the idea of GBA is accepted, the breadth of GBA applicability is uncertain.
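
The guilt-by-association idea described above can be illustrated with a minimal sketch. The abstract does not specify an algorithm, so everything here is an assumption made for illustration: the function names `pearson` and `gba_predict`, the correlation threshold `min_r`, the correlation-weighted voting rule, and the toy expression profiles.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length profiles (0.0 if either is flat)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def gba_predict(expr, labels, query, min_r=0.8):
    """Assign `query` the label with the largest correlation-weighted vote.

    Only annotated genes whose correlation with the query is at least
    `min_r` (a hypothetical cutoff) are allowed to vote.
    Returns None when no annotated gene passes the cutoff.
    """
    votes = {}
    for gene, label in labels.items():
        if gene == query:
            continue
        r = pearson(expr[query], expr[gene])
        if r >= min_r:
            votes[label] = votes.get(label, 0.0) + r
    if not votes:
        return None
    return max(votes, key=votes.get)


# Toy data, invented for this example.
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.5],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneX": [1.1, 2.0, 3.2, 3.9],  # unannotated query gene
}
labels = {
    "geneA": "photosynthesis",
    "geneB": "photosynthesis",
    "geneC": "ribosome assembly",
}

print(gba_predict(expr, labels, "geneX"))  # prints "photosynthesis"
```

A fixed cutoff and a single winning label keep the sketch short; how far such a rule generalizes is exactly the open question the abstract raises.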

4.
5.
6.
7.
Molecular biology has provided the means to identify parasite proteins, to define their function and patterns of expression, and to produce them in quantity for subsequent functional analyses. Whole genome and expressed sequence tag programmes, and the parallel development of powerful bioinformatics tools, allow genome-wide comparisons between stages or species and meaningful gene-expression profiling. The latter can be undertaken with several new technologies such as DNA microarray and serial analysis of gene expression. Proteome analysis has come to the fore in recent years, providing a crucial link between the gene and its protein product. RNA interference and ballistic gene transfer are exciting developments which can provide the means to precisely define the function of individual genes and, of importance in devising novel parasite control strategies, the effect that gene knockdown will have on parasite survival.

8.
9.
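Current microarray data analysis methods all rest on the assumption that genes with similar expression patterns are likely to have similar biological functions. In practice, however, genes that take part in the same biological function are related in the timing and location of their expression rather than showing similar patterns. Using a rice cDNA microarray, we studied gene expression in rice under ABA treatment and under drought, cold, and high-salinity stress. We selected genes related to environmental stress and ABA response and applied the shortest-path method, using software written in-house, to construct shortest paths between genes whose expression patterns are not directly correlated. The study shows that analysing the expression data of these genes can reveal their functional associations. It also explores function prediction for unknown genes and lays a foundation for constructing the molecular response network of rice under ABA and environmental stress.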
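
The shortest-path idea in this abstract can be sketched as follows. The authors' software and distance measure are not described here, so this is only an illustration under stated assumptions: edges join gene pairs whose absolute correlation is at least a hypothetical cutoff `min_abs_r`, the edge weight is taken as 1 - |r|, paths are found with Dijkstra's algorithm, and the gene names and correlations are invented.

```python
import heapq


def build_graph(corr, min_abs_r=0.8):
    """Build an undirected weighted graph from pairwise correlations.

    Assumption: only pairs with |r| >= min_abs_r are linked, and the
    edge weight is 1 - |r| (stronger correlation = shorter edge).
    """
    graph = {}
    for (a, b), r in corr.items():
        if abs(r) >= min_abs_r:
            weight = 1.0 - abs(r)
            graph.setdefault(a, {})[b] = weight
            graph.setdefault(b, {})[a] = weight
    return graph


def shortest_path(graph, source, target):
    """Dijkstra's algorithm; returns (path, distance) or (None, inf)."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    done = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in done:
            continue
        done.add(node)
        if node == target:
            break
        for neighbour, weight in graph.get(node, {}).items():
            new_d = d + weight
            if new_d < dist.get(neighbour, float("inf")):
                dist[neighbour] = new_d
                prev[neighbour] = node
                heapq.heappush(heap, (new_d, neighbour))
    if target not in dist:
        return None, float("inf")
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return path, dist[target]


# Toy correlations, invented for this example: g1 and g4 are not
# directly correlated but are linked through g2 and g3.
corr = {
    ("g1", "g2"): 0.90,
    ("g2", "g3"): -0.85,
    ("g3", "g4"): 0.95,
    ("g1", "g3"): 0.20,
    ("g1", "g4"): 0.10,
}
graph = build_graph(corr)
path, distance = shortest_path(graph, "g1", "g4")
print(path)  # ['g1', 'g2', 'g3', 'g4']
```

In this reading, the intermediate genes on the path (here `g2` and `g3`) are the candidates for a functional link between two genes whose profiles look unrelated.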

10.
11.
About 40% of the proteins encoded in eukaryotic genomes are proteins of unknown function (PUFs). Their functional characterization remains one of the main challenges in modern biology. In this study we identified the PUF encoding genes from Arabidopsis (Arabidopsis thaliana) using a combination of sequence similarity, domain-based, and empirical approaches. Large-scale gene expression analyses of 1,310 publicly available Affymetrix chips were performed to associate the identified PUF genes with regulatory networks and biological processes of known function. To generate quality results, the study was restricted to expression sets with replicated samples. First, genome-wide clustering and gene function enrichment analysis of clusters allowed us to associate 1,541 PUF genes with tightly coexpressed genes for proteins of known function (PKFs). Over 70% of them could be assigned to more specific biological process annotations than the ones available in the current Gene Ontology release. The most highly overrepresented functional categories in the obtained clusters were ribosome assembly, photosynthesis, and cell wall pathways. Interestingly, the majority of the PUF genes appeared to be controlled by the same regulatory networks as most PKF genes, because clusters enriched in PUF genes were extremely rare. Second, large-scale analysis of differentially expressed genes was applied to identify a comprehensive set of abiotic stress-response genes. This analysis resulted in the identification of 269 PKF and 104 PUF genes that responded to a wide variety of abiotic stresses, whereas 608 PKF and 206 PUF genes responded predominantly to specific stress treatments. The provided coexpression and differentially expressed gene data represent an important resource for guiding future functional characterization experiments of PUF and PKF genes. 
Finally, the public Plant Gene Expression Database (http://bioweb.ucr.edu/PED) was developed as part of this project to provide efficient access and mining tools for the vast gene expression data of this study.
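
The abstract above mentions gene function enrichment analysis of coexpression clusters without naming the statistical test. A common choice for this kind of question is the one-sided hypergeometric test, sketched below as an assumption rather than as the study's method. The function name `enrichment_p_value` and the example counts are hypothetical.

```python
from math import comb


def enrichment_p_value(genome_size, annotated, cluster_size, hits):
    """One-sided hypergeometric p-value P(X >= hits).

    genome_size  -- number of genes in the background set
    annotated    -- background genes carrying the functional category
    cluster_size -- number of genes in the cluster
    hits         -- cluster genes carrying the category
    """
    upper = min(annotated, cluster_size)
    total = comb(genome_size, cluster_size)
    tail = sum(
        comb(annotated, i) * comb(genome_size - annotated, cluster_size - i)
        for i in range(hits, upper + 1)
    )
    return tail / total


# Hypothetical counts: 10 background genes, 4 annotated,
# a cluster of 3 genes containing 2 annotated ones.
p = enrichment_p_value(10, 4, 3, 2)
print(round(p, 4))  # 0.3333
```

In practice the p-values for many categories and clusters would also need a multiple-testing correction, which is omitted here.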

12.
The INO2 gene of Saccharomyces cerevisiae is required for expression of most of the phospholipid biosynthetic genes. INO2 expression is regulated by a complex cascade that includes autoregulation, Opi1p-mediated repression and Ume6p-mediated activation. To screen for mutants with altered INO2 expression directly, we constructed an INO2-HIS3 reporter that provides a plate assay for INO2 promoter activity. This reporter was used to isolate mutants (dim1) that fail to repress expression of the INO2 gene in an otherwise wild-type strain. The dim1 mutants contain mutations in the OPI1 gene. To define further the mechanism for Ume6p regulation of INO2 expression, we isolated suppressors (rum1, 2, 3) of the ume6Δ mutation that overexpress the INO2-HIS3 gene. Two of the rum mutant groups contain mutations in the OPI1 and SIN3 genes showing that opi1 and sin3 mutations are epistatic to the ume6Δ mutation. These results are surprising given that Ume6p, Sin3p and Rpd3p are known to form a complex that represses the expression of a diverse set of yeast genes. This prompted us to examine the effect of sin3Δ and rpd3Δ mutants on INO2-cat expression. Surprisingly, the sin3Δ allele overexpressed INO2-cat, whereas the rpd3Δ mutant had no effect. We also show that the UME6 gene does not affect the expression of an OPI1-cat reporter. This suggests that Ume6p does not regulate INO2 expression indirectly by regulating OPI1 expression.

13.
We investigate a model of optimal regulation, intended to describe large-scale differential gene expression. Relations between the optimal expression patterns and the function of genes are deduced from an optimality principle: the regulators have to maximise a fitness function which they influence directly via a cost term, and indirectly via their control on important cell variables, such as metabolic fluxes. According to the model, the optimal linear response to small perturbations reflects the regulators' functions, namely their linear influences on the cell variables. The optimal behaviour can be realised by a linear feedback mechanism. Known or assumed properties of response coefficients lead to predictions about regulation patterns. A symmetry relation predicted for deletion experiments is verified with gene expression data. Where the optimality assumption is valid, our results justify the use of expression data for functional annotation and for pathway reconstruction and suggest the use of linear factor models for the analysis of gene expression data.
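
The link between an optimal response and the regulators' linear influences can be made concrete with a generic sketch. The notation is an assumption introduced here, not taken from the paper: $x$ is the vector of regulator expression levels, $\alpha$ a vector of perturbation parameters, $y(x,\alpha)$ the cell variables, $f$ the fitness contribution of the cell variables, and $g$ the cost of expression. Near a reference state, $y$ is assumed linear with response matrices $R_x = \partial y/\partial x$ and $R_\alpha = \partial y/\partial \alpha$.

```latex
% Fitness: indirect benefit via the cell variables minus a direct cost
F(x,\alpha) = f\bigl(y(x,\alpha)\bigr) - g(x)

% Optimality condition for the regulators
\nabla_x F\bigl(x^*(\alpha),\alpha\bigr) = 0

% Differentiating the optimality condition with respect to \alpha
F_{xx}\,\frac{\partial x^*}{\partial \alpha} + F_{x\alpha} = 0
\quad\Longrightarrow\quad
\frac{\partial x^*}{\partial \alpha} = -F_{xx}^{-1}\,F_{x\alpha}

% With y \approx y_0 + R_x\,\Delta x + R_\alpha\,\Delta\alpha:
F_{xx} = R_x^{\top} f_{yy}\, R_x - g_{xx},
\qquad
F_{x\alpha} = R_x^{\top} f_{yy}\, R_\alpha

% Optimal linear response to a small perturbation
\Delta x^* = -\bigl(R_x^{\top} f_{yy}\, R_x - g_{xx}\bigr)^{-1}
             R_x^{\top} f_{yy}\, R_\alpha\,\Delta\alpha
```

Under these assumptions the optimal response $\Delta x^*$ depends on each regulator only through its column of $R_x$ and its cost curvature, which is one way to read the statement that the optimal response reflects the regulators' linear influences on the cell variables.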

14.
In Plasmodium falciparum, dihydrofolate reductase and thymidylate synthase activities are conferred by a single 70-kDa bifunctional polypeptide (DHFR-TS, dihydrofolate reductase-thymidylate synthase) which assembles into a functional 140-kDa homodimer. In mammals, the two enzymes are smaller distinct molecules encoded on different genes. A 27-kDa amino domain of malarial DHFR-TS is sufficient to provide DHFR activity, but the structural requirements for TS function have not been established. Although the 3'-end of DHFR-TS has high homology to TS sequences from other species, expression of this protein fragment failed to yield active TS enzyme, and it failed to complement TS(-) Escherichia coli. Unexpectedly, even partial 5'-deletion of full-length DHFR-TS gene abolished TS function on the 3'-end. Thus, it was hypothesized that the amino end of the bifunctional parasite protein plays an important role in TS function. When the 27-kDa amino domain (DHFR) was provided in trans, a previously inactive 40-kDa carboxyl-domain from malarial DHFR-TS regained its TS function. Physical characterization of the "split enzymes" revealed that the 27- and the 40-kDa fragments of DHFR-TS had reassembled into a 140-kDa hybrid complex. Thus, in malarial DHFR-TS, there are physical interactions between the DHFR domain and the TS domain, and these interactions are necessary to obtain a catalytically active TS. Interference with these essential protein-protein interactions could lead to new selective strategies to treat malaria resistant to traditional DHFR-TS inhibitors.

15.
A data-driven clustering method for time course gene expression data   Total citations: 1, self-citations: 0, citations by others: 1
Gene expression over time is, biologically, a continuous process and can thus be represented by a continuous function, i.e. a curve. Individual genes often share similar expression patterns (functional forms). However, the shape of each function, the number of such functions, and the genes that share similar functional forms are typically unknown. Here we introduce an approach that allows direct discovery of related patterns of gene expression and their underlying functions (curves) from data without a priori specification of either cluster number or functional form. Smoothing spline clustering (SSC) models natural properties of gene expression over time, taking into account natural differences in gene expression within a cluster of similarly expressed genes, the effects of experimental measurement error, and missing data. Furthermore, SSC provides a visual summary of each cluster's gene expression function and goodness-of-fit by way of a 'mean curve' construct and its associated confidence bands. We apply this method to gene expression data over the life-cycle of Drosophila melanogaster and Caenorhabditis elegans to discover 17 and 16 unique patterns of gene expression in each species, respectively. New and previously described expression patterns in both species are discovered, the majority of which are biologically meaningful and exhibit statistically significant gene function enrichment. Software and source code implementing the algorithm, SSClust, is freely available (http://genemerge.bioteam.net/SSClust.html).

16.
GESTs (gene expression similarity and taxonomy similarity), a gene functional prediction approach previously proposed by us, is based on gene expression similarity and concept similarity of functional classes defined in Gene Ontology (GO). In this paper, we extend this method to protein-protein interaction data by introducing several methods to filter the neighbors in protein interaction networks for a protein of unknown function(s). Unlike conventional methods, the proposed approach automatically selects the most appropriate functional classes, as specific as possible, during the learning process, and calls on genes annotated to nearby classes to support predictions for small, specific classes in GO. Based on the yeast protein-protein interaction information from MIPS and a dataset of gene expression profiles, we assess the performance of our approach in predicting protein functions under “biology process” using three measures designed specifically for functional classes organized in GO. Results show that our method is powerful for predicting gene functions widely and with very specific functional terms. Based on the GO database published in December 2004, we predict functions for some proteins whose functions were unknown at that time, and some of the predictions have been confirmed by the new SGD annotation data published in April 2006.

17.
MicroRNAs (miRNAs) participate in various biological processes by controlling gene activity. Amphioxus is the best available stand-in for the proximate invertebrate ancestor of the vertebrates. Here, we systematically investigated the miRNAs in amphioxus. First, we identified 245 candidate amphioxus miRNAs, of which 183 are reported here for the first time. Second, we provide evidence supporting a birth-and-death process of miRNA genes in some families and discuss its implications for the functional diversification of miRNAs during evolution. Third, we identified 47 miRNAs with development-specific expression. We found that only 19 miRNAs were expressed in all developmental stages, 16 miRNAs were neurula-specific and 13 miRNAs were larva-specific. In addition, the potential target genes of these miRNAs were mainly classified into development, muscle formation, cell adhesion, and gene regulation categories. Finally, we found 79 immune-related genes targeted by 136 miRNAs in amphioxus. In conclusion, our results provide insight into both the function and the evolution of amphioxus miRNAs.

18.
19.
GESTs (gene expression similarity and taxonomy similarity), a gene functional prediction approach previously proposed by us, is based on gene expression similarity and concept similarity of functional classes defined in Gene Ontology (GO). In this paper, we extend this method to protein-protein interaction data by introducing several methods to filter the neighbors in protein interaction networks for a protein of unknown function(s). Unlike conventional methods, the proposed approach automatically selects the most appropriate functional classes, as specific as possible, during the learning process, and calls on genes annotated to nearby classes to support predictions for small, specific classes in GO. Based on the yeast protein-protein interaction information from MIPS and a dataset of gene expression profiles, we assess the performance of our approach in predicting protein functions under “biology process” using three measures designed specifically for functional classes organized in GO. Results show that our method is powerful for predicting gene functions widely and with very specific functional terms. Based on the GO database published in December 2004, we predict functions for some proteins whose functions were unknown at that time, and some of the predictions have been confirmed by the new SGD annotation data published in April 2006.

20.