Similar articles
1.

Background:

The goal of the gene normalization task is to link genes or gene products mentioned in the literature to biological databases. This is a key step in an accurate search of the biological literature. It is a challenging task, even for the human expert; genes are often described rather than referred to by gene symbol and, confusingly, one gene name may refer to different genes (often from different organisms). For BioCreative II, the task was to list the Entrez Gene identifiers for human genes or gene products mentioned in PubMed/MEDLINE abstracts. We selected abstracts associated with articles previously curated for human genes. We provided 281 expert-annotated abstracts containing 684 gene identifiers for training, and a blind test set of 262 documents containing 785 identifiers, with a gold standard created by expert annotators. Inter-annotator agreement was measured at over 90%.

Results:

Twenty groups submitted one to three runs each, for a total of 54 runs. Three systems achieved F-measures (balanced precision and recall) between 0.80 and 0.81. Combining the system outputs using simple voting schemes and classifiers obtained improved results; the best composite system achieved an F-measure of 0.92 with 10-fold cross-validation. A 'maximum recall' system based on the pooled responses of all participants gave a recall of 0.97 (with precision 0.23), identifying 763 out of 785 identifiers.

Conclusion:

Major advances for the BioCreative II gene normalization task include broader participation (20 versus 8 teams) and a pooled system performance comparable to human experts, at over 90% agreement. These results show promise as tools to link the literature with biological databases.
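
The scoring and system-combination ideas in this abstract can be illustrated with a small sketch. It assumes each system's output is a set of gene identifiers. The function names and the toy identifiers are hypothetical, and the voting rule is a generic threshold vote, not necessarily the scheme the organizers used.

```python
from collections import Counter


def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def score(predicted: set, gold: set) -> tuple:
    """Return (precision, recall, F) for one set of predicted identifiers."""
    true_pos = len(predicted & gold)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    return precision, recall, f_measure(precision, recall)


def vote(system_outputs: list, min_votes: int) -> set:
    """Keep identifiers proposed by at least `min_votes` systems."""
    counts = Counter(gene_id for output in system_outputs for gene_id in set(output))
    return {gene_id for gene_id, n in counts.items() if n >= min_votes}


# Hypothetical toy data: placeholders, not real Entrez Gene identifiers.
gold = {"g1", "g2", "g3", "g4"}
systems = [{"g1", "g2", "g9"}, {"g1", "g3", "g8"}, {"g1", "g2", "g3", "g7"}]

majority = vote(systems, min_votes=2)  # simple voting scheme
pooled = vote(systems, min_votes=1)    # union of all outputs, a 'maximum recall' pool

print("majority vote:", score(majority, gold))
print("pooled output:", score(pooled, gold))
```

With the figures reported above, the pooled recall is 763/785, which rounds to 0.97, and a precision of 0.23 with a recall of 0.97 corresponds to a balanced F-measure of roughly 0.37. This shows why pooling alone maximizes recall but does not give a usable system without a voting or classification step.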

2.
3.

Background  

The construction of interaction networks between proteins is central to understanding the underlying biological processes. However, since many useful relations are missing from databases and remain hidden in raw text, the study of automatic interaction extraction from text is important in the bioinformatics field.

4.
Extracting protein-protein interactions (PPIs) from biomedical literature is an important task in biomedical text mining (BioTM). In this paper, we propose a hash subgraph pairwise (HSP) kernel-based approach for this task. The key to the novel kernel is to use hierarchical hash labels to express the structural information of subgraphs in linear time. We apply the graph kernel to dependency graphs representing sentence structure for the protein-protein interaction extraction task. The kernel makes efficient use of the full structural information of the graph and, in particular, captures contiguous topological and label information that was previously ignored. We evaluate the proposed approach on five publicly available PPI corpora. The experimental results show that our approach significantly outperforms the all-path kernel approach on all five corpora and achieves state-of-the-art performance.

5.
Recent large-scale studies of protein complexes in yeast have demonstrated that the vast majority of proteins exist in the cell as parts of multicomponent assemblies, mostly novel and of unknown function. The structural and functional analysis of these complexes should be a priority for structural biologists in the coming years. In silico methods such as docking simulations, which may contribute to this analysis, are being tested in the CAPRI community-wide experiment, which assesses blind predictions of the structure of protein-protein complexes.

6.
Transferring the biological function of one protein to another is a key issue in understanding the structure-function relationship of proteins. We have developed a strategy for grafting protein-protein interaction epitopes. As a first step, residues at the interface of the ligand protein that interact strongly with the receptor protein were identified. Protein scaffolds were then docked onto the receptor protein based on geometric complementarity. Only matches with high docking scores were saved. For each saved match, the scaffold protein was accepted if it had suitable positions for grafting the key interaction residues of the ligand protein. These candidate residues were mutated to the corresponding residues of the ligand protein at each relevant position, and the mutated scaffold protein was co-minimized with the receptor protein. Finally, the minimized complexes were evaluated by a scoring function derived from statistical analysis of rigid binding data sets. As a test case, the binding epitope of barstar, the inhibitor of barnase, was grafted onto smaller proteins. Pheromone Er-1 (PDB entry 1erc) was found to be a good scaffold. The calculated binding free energy for the mutated Pheromone Er-1 is equivalent to that of barstar.

7.
Comparison of human protein-protein interaction maps   Cited by: 1 (self-citations: 0, citations by others: 1)
MOTIVATION: Large-scale mappings of protein-protein interactions have started to give us new views of the complex molecular mechanisms inside a cell. After initial projects to systematically map protein interactions in model organisms such as yeast, worm and fly, researchers have begun to focus on the mapping of the human interactome. To tackle this enormous challenge, different approaches have been proposed and pursued. While several large-scale human protein interaction maps have recently been published, their quality remains to be critically assessed. RESULTS: We present here a first comparative analysis of eight currently available large-scale maps, which together include over 10,000 unique proteins and 57,000 interactions. They are based on literature search, orthology or yeast two-hybrid assays. Comparison reveals only a small but statistically significant overlap. More importantly, our analysis gives clear indications that all interaction maps are subject to considerable selection and detection biases. These results have to be taken into account in the future assembly of the human interactome. AVAILABILITY: An integrated human interaction network called Unified Human Interactome (UniHI) is made publicly accessible at http://www.mdc-berlin.de/unihi. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
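
The kind of pairwise comparison described in this abstract can be sketched as follows. The sketch assumes each map is a list of undirected protein pairs. The map names, protein labels and the use of the Jaccard index are illustrative choices, and the significance testing mentioned in the abstract is not reproduced here.

```python
from itertools import combinations


def normalize(pairs):
    """Treat interactions as undirected: (A, B) and (B, A) are the same edge."""
    return {frozenset(pair) for pair in pairs}


def pairwise_overlap(maps: dict) -> dict:
    """For every pair of maps, return (number of shared interactions, Jaccard index)."""
    edges = {name: normalize(pairs) for name, pairs in maps.items()}
    result = {}
    for a, b in combinations(sorted(edges), 2):
        shared = edges[a] & edges[b]
        union = edges[a] | edges[b]
        jaccard = len(shared) / len(union) if union else 0.0
        result[(a, b)] = (len(shared), jaccard)
    return result


# Hypothetical toy maps; protein names are placeholders.
toy_maps = {
    "literature": [("P1", "P2"), ("P2", "P3"), ("P3", "P4")],
    "orthology": [("P2", "P1"), ("P4", "P5")],
    "y2h": [("P5", "P4"), ("P6", "P7")],
}

for (a, b), (shared, jaccard) in pairwise_overlap(toy_maps).items():
    print(f"{a} vs {b}: {shared} shared, Jaccard = {jaccard:.2f}")
```

Storing each interaction as a `frozenset` makes the comparison insensitive to the order in which the two partners are listed, which matters when maps come from sources with different conventions.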

8.
9.
We developed a 'computational second-site suppressor' strategy to redesign specificity at a protein-protein interface and applied it to create new specifically interacting DNase-inhibitor protein pairs. We demonstrate that the designed switch in specificity holds in in vitro binding and functional assays. We also show that the designed interfaces are specific in the natural functional context in living cells, and present the first high-resolution X-ray crystallographic analysis of a computer-redesigned functional protein-protein interface with altered specificity. The approach should be applicable to the design of interacting protein pairs with novel specificities for delineating and re-engineering protein interaction networks in living cells.

10.
Current proteomic techniques allow researchers to analyze chosen biological pathways or an ensemble of related protein complexes at a global level by measuring physical protein-protein interactions with affinity purification mass spectrometry (AP-MS). Such experiments yield information-rich but complex interaction maps whose unbiased interpretation is challenging. Guided by current knowledge of the modular structure of protein complexes, we propose a novel statistical approach, named BI-MAP, complemented by software tools and a visual grammar to present the inferred modules. We show that the BI-MAP tools can be applied to data ranging from small and very detailed maps to large, sparse, and much noisier data sets. The BI-MAP tool implementation and test data are made freely available.

11.
12.
The functional importance of protein-protein interactions suggests that there should be strong evolutionary constraint on their interaction interfaces. However, binding interfaces are frequently affected by amino acid replacements. Change due to coevolution within interfaces can contribute to variability but is not ubiquitous. An alternative explanation for the ability of surfaces to accept replacements may be that many residues can be changed without affecting the interaction. Candidates for these types of residues are those that make interchain interactions only through the protein main chain, β-carbon, or associated hydrogen atoms. Since almost all residues have these atoms, we hypothesize that this subset of interface residues may be more easily substituted than those that make interactions through other atoms. We term such interactions "residue type independent." Investigating this hypothesis, we find that nearly a quarter of residues in protein interaction interfaces make exclusively interchain residue-type-independent contacts. These residues are less structurally constrained and less conserved than residues making residue-type-specific interactions. We propose that residue-type-independent interactions allow substitutions in binding interfaces while the specificity of binding is maintained.

13.

Background

Currently, a huge amount of protein-protein interaction data is available from high-throughput experimental methods. In a large network of protein-protein interactions, groups of proteins can be identified as functional clusters having related functions, where a single protein can occur in multiple clusters. However, experimental methods are error-prone, and thus the interactions in a functional cluster may include false positives, or there may be unreported interactions. Therefore, correctly identifying a functional cluster of proteins requires knowing whether any two proteins in a cluster interact, whether an interaction can exclude other interactions, and how strong the affinity between two interacting proteins is.

Methods

In the present work the yeast protein-protein interaction network is clustered using a spectral clustering method proposed by us in 2006 and the individual clusters are investigated for functional relationships among the member proteins. 3D structural models of the proteins in one cluster have been built – the protein structures are retrieved from the Protein Data Bank or predicted using a comparative modeling approach. A rigid body protein docking method (Cluspro) is used to predict the protein-protein interaction complexes. Binding sites of the docked complexes are characterized by their buried surface areas in the docked complexes, as a measure of the strength of an interaction.

Results

The clustering method yields functionally coherent clusters. Some of the interactions in a cluster exclude other interactions because of shared binding sites. New interactions among the interacting proteins are uncovered, and thus higher-order protein complexes in the cluster are proposed. The relative stability of each of the protein complexes in the cluster is also reported.

Conclusions

Although the methods used are computationally expensive and require human intervention and judgment, they can identify the interactions that could occur together and those that are mutually exclusive. In addition, indirect interactions through an intermediate protein can be identified. These theoretical predictions might be useful for crystallographers in selecting targets for the X-ray crystallographic determination of protein complexes.
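
Two quantities used in this abstract lend themselves to a short worked sketch: the buried surface area of a docked complex and the test for mutually exclusive interactions through shared binding sites. The sketch assumes solvent-accessible surface areas (SASA) have already been computed by some external tool. The function names, the residue labels and the numeric values are hypothetical, and the total-area convention used here is one of several in use (some authors report half of this value).

```python
def buried_surface_area(sasa_a: float, sasa_b: float, sasa_complex: float) -> float:
    """Total surface buried on complex formation: SASA(A) + SASA(B) - SASA(AB)."""
    return sasa_a + sasa_b - sasa_complex


def share_binding_site(site_x: set, site_y: set, min_shared: int = 1) -> bool:
    """Two interactions on the same protein are flagged as mutually exclusive
    when their binding sites share at least `min_shared` residues."""
    return len(site_x & site_y) >= min_shared


# Hypothetical values in square angstroms for a docked pair A-B.
bsa = buried_surface_area(sasa_a=5000.0, sasa_b=4000.0, sasa_complex=7600.0)
print("buried surface area:", bsa)

# Hypothetical binding-site residues on protein A for partners B and C.
site_for_b = {"ARG45", "GLU48", "TYR52"}
site_for_c = {"TYR52", "LEU90"}
print("A-B and A-C mutually exclusive:", share_binding_site(site_for_b, site_for_c))
```

A larger buried area is read here as a proxy for a stronger interaction, as in the abstract. The overlap threshold `min_shared` is an assumption that would need tuning against real complexes.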

14.
The scale-free structure $p(k) \sim k^{-\gamma}$ of protein-protein interaction networks can be reproduced in simulation by a static physical model. We examine the model theoretically and identify the key reason why it generates apparently scale-free degree distributions. This explanation provides a generic mechanism for 'scale-free' networks. Moreover, we predict the dependence of $\gamma$ on experimental protein concentrations and other sensitivity factors in detecting interactions, and find experimental evidence to support the prediction.
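
For a degree distribution that follows a power law with exponent gamma, the exponent can be estimated from observed node degrees. The following is a minimal sketch that uses an ordinary least-squares fit on log-log axes. This is a simple illustration rather than the analysis in the paper, maximum-likelihood estimators are generally more reliable for real data, and all function names are hypothetical.

```python
import math
from collections import Counter


def degree_distribution(degrees):
    """Empirical p(k) over nodes with degree k > 0."""
    positive = [k for k in degrees if k > 0]
    counts = Counter(positive)
    return {k: n / len(positive) for k, n in counts.items()}


def fit_gamma(p_of_k: dict) -> float:
    """Least-squares slope of log p(k) against log k; returns gamma = -slope."""
    xs = [math.log(k) for k in p_of_k]
    ys = [math.log(p) for p in p_of_k.values()]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two distinct degrees to fit a slope")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx


# Synthetic check: an exact power law with gamma = 2.5 over k = 1..50.
weights = {k: k ** -2.5 for k in range(1, 51)}
total = sum(weights.values())
synthetic = {k: w / total for k, w in weights.items()}
print("estimated gamma:", round(fit_gamma(synthetic), 3))
```

The normalization constant shifts the intercept only, so the fitted slope recovers the exponent of the synthetic distribution. On real interaction data the tail is noisy, and binning or a maximum-likelihood fit would be a more careful choice.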

15.
Rigid-body docking has become quite successful in predicting the correct conformations of binary protein complexes, at least when the constituent proteins do not undergo large conformational changes upon binding. However, determining whether two given proteins interact is a more difficult problem. Successful docking procedures often give equally good scores for proteins that do not interact experimentally. This is the case for the multiple minimization approach we use here. An analysis of the results where all proteins within a set are docked with all other proteins (complete cross-docking) shows that the predictions can be greatly improved if the location of the correct binding interface on each protein is known, since the experimental complexes are much more likely to bring these two interfaces into contact, at the same time as yielding good interaction energy scores. While various methods exist for identifying binding interfaces, it is shown that simply studying the interaction of all potential protein pairs within a data set can itself help to identify the correct interfaces.

16.
A catalog of all human protein-protein interactions would provide scientists with a framework to study protein deregulation in complex diseases such as cancer. Here we demonstrate that a probabilistic analysis integrating model organism interactome data, protein domain data, genome-wide gene expression data and functional annotation data predicts nearly 40,000 protein-protein interactions in humans, a result comparable to those obtained with experimental and computational approaches in model organisms. We validated the accuracy of the predictive model on an independent test set of known interactions and also experimentally confirmed two predicted interactions relevant to human cancer, implicating uncharacterized proteins in defined pathways. We also applied the human interactome network to cancer genomics data and identified several interaction subnetworks activated in cancer. This integrative analysis provides a comprehensive framework for exploring the human protein interaction network.

17.
Protein-protein interaction networks are useful in the contextual annotation of protein function and, more generally, in achieving a system-level understanding of cellular behavior. This work reports on the social behavior of the yeast protein-protein interaction network and concludes that it is non-random. While providing an analysis of the organization of genes into functional societies, this work can potentially be useful in assessing the accuracy of contextual gene annotation based on such interaction networks.

18.
More than 200 proteins copurify with spliceosomes, the compositionally dynamic RNPs catalyzing pre-mRNA splicing. To better understand the protein-protein interactions governing splicing, we systematically investigated interactions between human spliceosomal proteins. A comprehensive Y2H interaction matrix screen generated a protein interaction map comprising 632 interactions between 196 proteins. Among these, 242 interactions were found between spliceosomal core proteins and largely validated by coimmunoprecipitation. To reveal dynamic changes in protein interactions, we integrated spliceosomal complex purification information with our interaction data and performed link clustering. These data, together with interaction competition experiments, suggest that during step 1 of splicing, hPRP8 interactions with SF3b proteins are replaced by hSLU7, positioning this second step factor close to the active site, and that the DEAH-box helicases hPRP2 and hPRP16 cooperate through ordered interactions with GPKOW. Our data provide extensive information about the spliceosomal protein interaction network and its dynamics.

19.
The MIPS mammalian protein-protein interaction database   Cited by: 10 (self-citations: 0, citations by others: 10)
SUMMARY: The MIPS mammalian protein-protein interaction database (MPPI) is a new resource of high-quality experimental protein interaction data in mammals. The content is based on published experimental evidence that has been processed by human expert curators. We provide the full dataset for download and a flexible and powerful web interface for users with various requirements.

20.