Similar literature
20 similar documents found.
1.

Background:

Most successful computational approaches for protein function prediction integrate multiple genomics and proteomics data sources to make inferences about the function of unknown proteins. The most accurate of these algorithms have long running times, making them unsuitable for real-time protein function prediction in large genomes. As a result, the predictions of these algorithms are stored in static databases that can easily become outdated. We propose a new algorithm, GeneMANIA, that is as accurate as the leading methods, while capable of predicting protein function in real-time.

Results:

We use a fast heuristic algorithm, derived from ridge regression, to integrate multiple functional association networks and predict gene function from a single process-specific network using label propagation. Our algorithm is efficient enough to be deployed on a modern webserver and is as accurate as, or more so than, the leading methods on the MouseFunc I benchmark and a new yeast function prediction benchmark; it is robust to redundant and irrelevant data and requires, on average, less than ten seconds of computation time on tasks from these benchmarks.

Conclusion:

GeneMANIA is fast enough to predict gene function on-the-fly while achieving state-of-the-art accuracy. A prototype version of a GeneMANIA-based webserver is available at http://morrislab.med.utoronto.ca/prototype.
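
The label propagation step mentioned in the Results can be illustrated with a minimal sketch. It assumes one common Gaussian-field formulation, in which the scores f solve (I + D - W) f = y for a weighted network W with degree matrix D and label vector y; this is not necessarily the exact objective or solver used by GeneMANIA. The function name, the toy network, and the label coding (+1 positive, -1 negative, 0 unlabeled) are hypothetical.

```python
def propagate_labels(weights, labels, iterations=200):
    """Jacobi iteration for (I + D - W) f = y on a small weighted graph.

    weights: symmetric adjacency matrix (list of lists), zero diagonal.
    labels:  initial label bias per node (+1, -1, or 0 for unlabeled).
    """
    n = len(labels)
    degree = [sum(row) for row in weights]
    scores = list(labels)
    for _ in range(iterations):
        # Each node's score is pulled toward its own label and its neighbours' scores.
        scores = [
            (labels[i] + sum(weights[i][j] * scores[j] for j in range(n)))
            / (1.0 + degree[i])
            for i in range(n)
        ]
    return scores


# Hypothetical toy network: four genes connected in a chain 0-1-2-3.
W = [
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
]
y = [1.0, 0.0, 0.0, -1.0]  # gene 0 is a known positive, gene 3 a known negative
scores = propagate_labels(W, y)
```

In this toy case the unlabeled gene next to the positive example ends up with a positive score and the one next to the negative example with a negative score, which is the behaviour label propagation is meant to produce.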

2.
Probabilistic association discovery aims at identifying the association between random vectors, regardless of the number of variables involved or of linear/nonlinear functional forms. Recently, applications in high-dimensional data have generated rising interest in probabilistic association discovery. We developed a framework based on functions on the observation graph, named MeDiA (Mean Distance Association). We generalize its property to a group of functions on the observation graph. The group of functions encapsulates major existing methods in association discovery, e.g. mutual information and Brownian Covariance, and can be expanded to more complicated forms. We conducted numerical comparison of the statistical power of related methods under multiple scenarios. We further demonstrated the application of MeDiA as a method of gene set analysis that captures a broader range of responses than traditional gene set analysis methods.
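
As an illustration of one of the existing measures named above, the following sketch computes the sample distance correlation, the normalized form of Brownian (distance) covariance, for two one-dimensional samples. It is not an implementation of MeDiA itself; the function names and the toy data are hypothetical, and scalar observations are assumed for brevity.

```python
import math


def _centered_distances(xs):
    """Pairwise |x_i - x_j| matrix, double-centered by row, column and grand means."""
    n = len(xs)
    d = [[abs(a - b) for b in xs] for a in xs]
    row_mean = [sum(r) / n for r in d]  # equals the column means (d is symmetric)
    grand = sum(row_mean) / n
    return [[d[i][j] - row_mean[i] - row_mean[j] + grand for j in range(n)]
            for i in range(n)]


def distance_correlation(xs, ys):
    """Sample distance correlation between two equally long 1-D samples."""
    n = len(xs)
    a = _centered_distances(xs)
    b = _centered_distances(ys)

    def mean_product(p, q):
        return sum(p[i][j] * q[i][j] for i in range(n) for j in range(n)) / (n * n)

    dcov2 = mean_product(a, b)
    dvar_x = mean_product(a, a)
    dvar_y = mean_product(b, b)
    if dvar_x == 0.0 or dvar_y == 0.0:
        return 0.0  # a constant sample carries no association
    return math.sqrt(max(dcov2, 0.0) / math.sqrt(dvar_x * dvar_y))


# Toy data: y = x^2 has zero Pearson correlation with x on this symmetric sample,
# but the dependence is still picked up by distance correlation.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y_quadratic = [v * v for v in x]
dcor_nonlinear = distance_correlation(x, y_quadratic)
dcor_identity = distance_correlation(x, x)
```

The quadratic example is the kind of nonlinear relationship that a linear correlation coefficient misses and that distance-based measures are designed to detect.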

3.
4.
5.
6.

Background

The B3 DNA binding domain includes five families: auxin response factor (ARF), abscisic acid-insensitive3 (ABI3), high level expression of sugar inducible (HSI), related to ABI3/VP1 (RAV) and reproductive meristem (REM). The release of the complete genomes of the angiosperm eudicots Arabidopsis thaliana and Populus trichocarpa, the monocot Oryza sativa, the bryophyte Physcomitrella patens, the green algae Chlamydomonas reinhardtii and Volvox carteri and the red alga Cyanidioschyzon merolae provided an exceptional opportunity to study the evolution of this superfamily.

Methodology

In order to better understand the origin and the diversification of B3 domains in plants, we combined comparative phylogenetic analysis with exon/intron structure and duplication events. In addition, we investigated the conservation and divergence of the B3 domain during the origin and evolution of each family.

Conclusions

Our data indicate that the B3-containing genes have undergone extensive duplication events, and that the REM family B3 domain has a highly diverged DNA binding. Our results also indicate that the founding member of the B3 gene family is likely to be similar to the ABI3/HSI genes found in C. reinhardtii and V. carteri. Among the B3 families, ABI3, HSI, RAV and ARF are most structurally conserved, whereas the REM family has experienced a rapid divergence. These results are discussed in light of their functional and evolutionary roles in plant development.

7.
8.
Numerous prognostic gene expression signatures for breast cancer have been generated previously, with little overlap and limited insight into the biology of the disease. Here we introduce a novel algorithm named SCoR (Survival analysis using Cox proportional hazard regression and Random resampling) that applies random resampling and clustering methods to identify gene features correlated with time-to-event data. This is shown to reduce the overfitting noise involved in microarray data analysis and to discover functional gene sets linked to patient survival. SCoR independently identified a common poor prognostic signature composed of cell proliferation genes from six out of eight breast cancer datasets. Furthermore, a sequential SCoR analysis on highly proliferative breast cancers repeatedly identified T/B cell markers as favorable prognosis factors. In glioblastoma, SCoR identified a common good prognostic signature of chromosome 10 genes from two gene expression datasets (TCGA and REMBRANDT), recapitulating the fact that loss of one copy of chromosome 10 (which harbors the tumor suppressor PTEN) is linked to poor survival in glioblastoma patients. SCoR also identified prognostic genes on sex chromosomes in lung adenocarcinomas, suggesting patient gender might be used to predict outcome in this disease. These results demonstrate the power of SCoR to identify common and biologically meaningful prognostic gene expression signatures.

9.
Na+-dependent chloride cotransporters (NKCC1, NKCC2, and NCC) are activated by phosphorylation to play critical roles in diverse physiological responses, including renal salt balance, hearing, epithelial fluid secretion, and volume regulation. Serine threonine kinase WNK4 (With No K = lysine member 4) and members of the Ste20 kinase family, namely SPAK and OSR1 (Ste20-related proline/alanine-rich kinase, Oxidative stress-responsive kinase) govern phosphorylation. According to present understanding, WNK4 phosphorylates key residues within SPAK/OSR1 leading to kinase activation, allowing SPAK/OSR1 to bind to and phosphorylate NKCC1, NKCC2, and NCC. Recently, the calcium-binding protein 39 (Cab39) has emerged as a binding partner and enhancer of SPAK/OSR1 activity, facilitating kinase autoactivation and promoting phosphorylation of the cotransporters. In the present study, we provide evidence showing that Cab39 differentially interacts with WNK4 and SPAK/OSR1 to switch the classic two-kinase cascade into a single-kinase transduction mechanism. We found that WNK4 in association with Cab39 activates NKCC1 in a SPAK/OSR1-independent manner. We discovered that WNK4 possesses a domain that bears close resemblance to the SPAK/OSR1 C-terminal CCT/PF2 domain, which is required for physical interaction between the Ste20 kinases and the Na+-driven chloride cotransporters. Modeling, yeast two-hybrid, and functional data reveal that this PF2-like domain located downstream of the catalytic domain in WNK4 promotes the direct interaction between the kinase and NKCC1. We conclude that in addition to SPAK and OSR1, WNK4 is able to anchor itself to the N-terminal domain of NKCC1 and to promote cotransporter activation.

10.
The genes for all cytoplasmic and potentially all mitochondrial aminoacyl-tRNA synthetases (aaRSs) were identified, and all those tested by RNA interference were found to be essential for the growth of Trypanosoma brucei. Some of these enzymes were localized to the cytoplasm or mitochondrion, but most were dually localized to both cellular compartments. Cytoplasmic T. brucei aaRSs were organized in a multiprotein complex in both bloodstream and procyclic forms. The multiple aminoacyl-tRNA synthetase (MARS) complex contained at least six aaRS enzymes and three additional non-aaRS proteins. Steady-state kinetic studies showed that association in the MARS complex enhances tRNA-aminoacylation efficiency, which is in part dependent on a MARS complex-associated protein (MCP), named MCP2, that binds tRNAs and increases their aminoacylation by the complex. Conditional repression of MCP2 in T. brucei bloodstream forms resulted in reduced parasite growth and infectivity in mice. Thus, association in a MARS complex enhances tRNA-aminoacylation and contributes to parasite fitness. The MARS complex may be part of a cellular regulatory system and a target for drug development.

11.
12.
13.
14.
15.
The chloroplast signal recognition particle (cpSRP) and its receptor, chloroplast FtsY (cpFtsY), form an essential complex with the translocase Albino3 (Alb3) during post-translational targeting of light-harvesting chlorophyll-binding proteins (LHCPs). Here, we describe a combination of studies that explore the binding interface and functional role of a previously identified cpSRP43-Alb3 interaction. Using recombinant proteins corresponding to the C terminus of Alb3 (Alb3-Cterm) and various domains of cpSRP43, we identify the ankyrin repeat region of cpSRP43 as the domain primarily responsible for the interaction with Alb3-Cterm. Furthermore, we show Alb3-Cterm dissociates a cpSRP·LHCP targeting complex in vitro and stimulates GTP hydrolysis by cpSRP54 and cpFtsY in a strictly cpSRP43-dependent manner. These results support a model in which interactions between the ankyrin region of cpSRP43 and the C terminus of Alb3 promote distinct membrane-localized events, including LHCP release from cpSRP and release of targeting components from Alb3.

16.
Members of a family of collagen-binding microbial surface components recognizing adhesive matrix molecules (MSCRAMMs) from Gram-positive bacteria are established virulence factors in several infectious disease models. Here, we report that these adhesins also can bind C1q and act as inhibitors of the classical complement pathway. Molecular analyses of Cna from Staphylococcus aureus suggested that this prototype MSCRAMM bound to the collagenous domain of C1q and interfered with the interactions of C1r with C1q. As a result, C1r2C1s2 was displaced from C1q, and the C1 complex was deactivated. This novel function of the Cna-like MSCRAMMs represents a potential immune evasion strategy that could be used by numerous Gram-positive pathogens.

17.
18.
HS3st1 (heparan sulfate 3-O-sulfotransferase isoform-1) is a critical enzyme involved in the biosynthesis of the antithrombin III (AT)-binding site in the biopharmaceutical drug heparin. Heparin is a highly sulfated glycosaminoglycan that shares a common biosynthetic pathway with heparan sulfate (HS). Although only granulated cells, such as mast cells, biosynthesize heparin, all animal cells are capable of biosynthesizing HS. As part of an effort to bioengineer CHO cells to produce heparin, we previously showed that the introduction of both HS3st1 and NDST2 (N-deacetylase/N-sulfotransferase isoform-2) afforded HS with a very low level of anticoagulant activity. This study demonstrated that untargeted HS3st1 is broadly distributed throughout CHO cells and forms no detectable AT-binding sites, whereas Golgi-targeted HS3st1 localizes in the Golgi and results in the formation of a single type of AT-binding site and high anti-factor Xa activity (137 ± 36 units/mg). Moreover, stable overexpression of HS3st1 also results in up-regulation of 2-O-, 6-O-, and N-sulfo group-containing disaccharides, further emphasizing a previously unknown concerted interplay between the HS biosynthetic enzymes and suggesting the need to control the expression level of all of the biosynthetic enzymes to produce heparin in CHO cells.

19.
With the rapid accumulation of biological omics datasets, decoding the underlying relationships of cross-dataset genes becomes an important issue. Previous studies have attempted to identify differentially expressed genes across datasets. However, it is hard for them to detect interrelated ones. Moreover, existing correlation-based algorithms can only measure the relationship between genes within a single dataset or two multi-modal datasets from the same samples. It is still unclear how to quantify the strength of association of the same gene across two biological datasets with different samples. To this end, we propose Approximate Distance Correlation (ADC) to select interrelated genes with statistical significance across two different biological datasets. ADC first obtains the k most correlated genes for each target gene as its approximate observations, and then calculates the distance correlation (DC) for the target gene across two datasets. ADC repeats this process for all genes and then performs the Benjamini-Hochberg adjustment to control the false discovery rate. We demonstrate the effectiveness of ADC with simulation data and four real applications to select highly interrelated genes across two datasets. These four applications include 21 cancer RNA-seq datasets of different tissues; six single-cell RNA-seq (scRNA-seq) datasets of mouse hematopoietic cells across six different cell types along the hematopoietic cell lineage; five scRNA-seq datasets of pancreatic islet cells across five different technologies; and coupled single-cell ATAC-seq (scATAC-seq) and scRNA-seq data of peripheral blood mononuclear cells (PBMC). Extensive results demonstrate that ADC is a powerful tool to uncover interrelated genes with strong biological implications and is scalable to large-scale datasets. Moreover, the number of such genes can serve as a metric to measure the similarity between two datasets, which could characterize the relative difference of diverse cell types and technologies.
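
The final step described above, the Benjamini-Hochberg adjustment, can be sketched as follows. This is the standard step-up procedure written out by hand, not code from the ADC authors; the function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical per-gene p-values.
p = [0.01, 0.04, 0.03, 0.20]
q = benjamini_hochberg(p)
# Genes whose adjusted value falls below the chosen FDR level are selected.
selected = [i for i, value in enumerate(q) if value < 0.05]
```

With these made-up values only the first gene passes a 5% false discovery rate threshold, although three of the raw p-values are below 0.05.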

20.
Sarcolemmal membrane-associated protein (SLMAP) is a tail-anchored protein involved in fundamental cellular processes, such as myoblast fusion, cell cycle progression, and chromosomal inheritance. Further, SLMAP misexpression is associated with endothelial dysfunctions in diabetes and cancer. SLMAP is part of the conserved striatin-interacting phosphatase and kinase (STRIPAK) complex required for specific signaling pathways in yeasts, filamentous fungi, insects, and mammals. In filamentous fungi, STRIPAK was initially discovered in Sordaria macrospora, a model system for fungal differentiation. Here, we functionally characterize the STRIPAK subunit PRO45, a homolog of human SLMAP. We show that PRO45 is required for sexual propagation and cell-to-cell fusion and that its forkhead-associated (FHA) domain is essential for these processes. Protein-protein interaction studies revealed that PRO45 binds to STRIPAK subunits PRO11 and SmMOB3, which are also required for sexual propagation. Superresolution structured-illumination microscopy (SIM) further established that PRO45 localizes to the nuclear envelope, endoplasmic reticulum, and mitochondria. SIM also showed that localization to the nuclear envelope requires STRIPAK subunits PRO11 and PRO22, whereas for mitochondria it does not. Taken together, our study provides important insights into fundamental roles of the fungal SLMAP homolog PRO45 and suggests STRIPAK-related and STRIPAK-unrelated functions.
