Similar articles
1.
Local modeling of global interactome networks
MOTIVATION: Systems biology requires accurate models of protein complexes, including physical interactions that assemble and regulate these molecular machines. Yeast two-hybrid (Y2H) and affinity-purification/mass-spectrometry (AP-MS) technologies measure different protein-protein relationships, and issues of completeness, sensitivity and specificity fuel debate over which is best for high-throughput 'interactome' data collection. Static graphs currently used to model Y2H and AP-MS data neglect dynamic and spatial aspects of macromolecular complexes and pleiotropic protein function. RESULTS: We apply the local modeling methodology proposed by Scholtens and Gentleman (2004) to two publicly available datasets and demonstrate its uses, interpretation and limitations. Specifically, we use this technology to address four major issues pertaining to protein-protein networks. (1) We motivate the need to move from static global interactome graphs to local protein complex models. (2) We formally show that accurate local interactome models require both Y2H and AP-MS data, even in idealized situations. (3) We briefly discuss experimental design issues and how bait selection affects interpretability of results. (4) We point to the implications of local modeling for systems biology including functional annotation, new complex prediction, pathway interactivity and coordination with gene-expression data. AVAILABILITY: The local modeling algorithm and all protein complex estimates reported here can be found in the R package apComplex, available at http://www.bioconductor.org CONTACT: dscholtens@northwestern.edu SUPPLEMENTARY INFORMATION: http://daisy.prevmed.northwestern.edu/~denise/pubs/LocalModeling

2.
MOTIVATION: There is no widely applicable method to determine the sample size for experiments that base statistical significance on the false discovery rate (FDR). RESULTS: We propose and develop the anticipated FDR (aFDR) as a conceptual tool for determining sample size. We derive mathematical expressions for the aFDR and anticipated average statistical power. These expressions are used to develop a general algorithm to determine sample size. We provide specific details on how to implement the algorithm for k-group (k ≥ 2) comparisons. The algorithm performs well for k-group comparisons in a series of traditional simulations and in a real-data simulation conducted by resampling from a large, publicly available dataset. AVAILABILITY: Documented S-plus and R code libraries are freely available from www.stjuderesearch.org/depts/biostats.

3.
MOTIVATION: Gene expression patterns obtained by in situ mRNA hybridization provide important information about different genes during Drosophila embryogenesis. So far, these images have been annotated by manually assigning a subset of anatomy ontology terms to each image. This time-consuming process depends heavily on the consistency of experts. RESULTS: We develop a system to automatically annotate a fruitfly's embryonic tissue in which a gene has expression. We formulate the task as an image pattern recognition problem. For a new fly embryo image, our system answers two questions: (1) Which stage range does an image belong to? (2) Which annotations should be assigned to an image? We propose to identify the wavelet embryo features by a multi-resolution 2D discrete wavelet transform, followed by min-redundancy max-relevance feature selection, which yields optimal distinguishing features for an annotation. We then construct a series of parallel bi-class predictors to solve the multi-objective annotation problem, since each image may correspond to multiple annotations. SUPPLEMENTARY INFORMATION: The complete annotation prediction results are available at: http://www.cs.niu.edu/~jzhou/papers/fruitfly and http://research.janelia.org/peng/proj/fly_embryo_annotation/. The datasets used in experiments will be available upon request to the corresponding author.

4.
We report on a major update (version 2) of the original SHort Read Mapping Program (SHRiMP). SHRiMP2 primarily targets mapping sensitivity, and is able to achieve high accuracy at a very reasonable speed. SHRiMP2 supports both letter space and color space (AB/SOLiD) reads, enables direct alignment of paired reads and uses parallel computation to fully utilize multi-core architectures. AVAILABILITY: SHRiMP2 executables and source code are freely available at: http://compbio.cs.toronto.edu/shrimp/.

5.
rh_tsp_map is a software package for computing radiation hybrid (RH) maps and for integrating physical and genetic maps. It solves the central mapping instances by reducing them to the traveling salesman problem (TSP) and using a modification of the CONCORDE package to solve the TSP instances. We present some of the features added between the initial rh_tsp_map version 1.0 and the current version 3.0, emphasizing the automation of many steps and the addition of various checks designed to find problems with the input data. Iterations of improved input data followed by fast re-computation of the maps improve the quality of the final maps. AVAILABILITY: rh_tsp_map source code and documentation, including a tutorial, are available at ftp://ftp.ncbi.nih.gov/pub/agarwala/rhmapping/rh_tsp_map.tar.gz. CONCORDE modified for RH mapping is available in the directory http://www.isye.gatech.edu/~wcook/rh/. The QSopt library needed for CONCORDE is available at http://www2.isye.gatech.edu/~wcook/qsopt/downloads/downloads.htm
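
The reduction of marker ordering to a TSP-style problem can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the retention vectors are invented, the distance between two markers is taken to be the number of hybrids in which their retention differs, and the search is a brute-force enumeration rather than CONCORDE. It is not the objective function or solver used by rh_tsp_map.

```python
from itertools import permutations

# Hypothetical retention data: 1 = marker retained in that hybrid, 0 = not retained.
RETENTION = {
    "A": (1, 1, 0, 0, 0, 0),
    "B": (1, 1, 1, 0, 0, 0),
    "C": (0, 1, 1, 1, 0, 0),
    "D": (0, 0, 1, 1, 1, 0),
}


def breaks(a, b):
    """Count hybrids in which two markers differ (assumed distance measure)."""
    return sum(x != y for x, y in zip(a, b))


def best_order(vectors):
    """Exhaustively find the marker order with the fewest total breaks.

    This is a Hamiltonian-path search, feasible only for a handful of markers.
    """
    best = None
    for perm in permutations(sorted(vectors)):
        if perm[0] > perm[-1]:
            continue  # an order and its reverse describe the same map
        cost = sum(breaks(vectors[p], vectors[q]) for p, q in zip(perm, perm[1:]))
        if best is None or cost < best[0]:
            best = (cost, perm)
    return best


if __name__ == "__main__":
    print(best_order(RETENTION))
```

Exhaustive search grows factorially with the number of markers, which is why a dedicated TSP solver is needed for realistic maps.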

6.
MOTIVATION: G-protein coupled receptors are a major class of eukaryotic cell-surface receptors. A very important aspect of their function is the specific interaction (coupling) with members of four G-protein families. A single GPCR may interact with members of more than one G-protein family (promiscuous coupling). To date, all published methods that predict the coupling specificity of GPCRs are restricted to three main coupling groups, G(i/o), G(q/11) and G(s), and do not include G(12/13)-coupled or other promiscuous receptors. RESULTS: We present a method that combines hidden Markov models and a feed-forward artificial neural network to overcome these limitations, while producing the most accurate predictions currently available. Using an up-to-date curated dataset, our method yields a 94% correct classification rate in a 5-fold cross-validation test. The method also predicts promiscuous coupling preferences, including coupling to G(12/13), and, unlike other methods, avoids overpredictions (false positives) when non-GPCR sequences are encountered. AVAILABILITY: A webserver for academic users is available at http://bioinformatics.biol.uoa.gr/PRED-COUPLE2

7.
8.
SUMMARY: We present here the freely available Metabolomics Project resource specifically designed to work under the CcpNmr Analysis program produced by CCPN (Collaborative Computing Project for NMR) (Vranken et al., 2005, The CCPN data model for NMR spectroscopy: development of a software pipeline. Proteins, 59, 687-696). The project consists of a database of assigned 1D and 2D spectra of many common metabolites. The project aims to help the user to analyze and assign 1D and 2D NMR spectra of unknown metabolite mixtures. Spectra of unknown mixtures can be easily superimposed and compared with the database spectra, thus facilitating their assignment and identification. AVAILABILITY: The CCPN Metabolomics Project, together with an annotated example dataset, is freely available via: http://www.ccpn.ac.uk/metabolomics/.

9.
10.
BACKGROUND: Flux coupling analysis (FCA) has become a useful tool in the constraint-based analysis of genome-scale metabolic networks. FCA allows detecting dependencies between reaction fluxes of metabolic networks at steady-state. On the one hand, this can help in the curation of reconstructed metabolic networks by verifying whether the coupling between reactions is in agreement with the experimental findings. On the other hand, FCA can aid in defining intervention strategies to knock out target reactions. RESULTS: We present a new method F2C2 for FCA, which is orders of magnitude faster than previous approaches. As a consequence, FCA of genome-scale metabolic networks can now be performed in a routine manner. CONCLUSIONS: We propose F2C2 as a fast tool for the computation of flux coupling in genome-scale metabolic networks. F2C2 is freely available for non-commercial use at https://sourceforge.net/projects/f2c2/files/

11.
12.
13.
We describe a simple software tool, 'matrix2png', for creating color images of matrix data. Originally designed with the display of microarray data sets in mind, it is a general tool that can be used to make simple visualizations of matrices for use in figures, web pages, slide presentations and the like. It can also be used to generate images 'on the fly' in web applications. Both continuous-valued and discrete-valued (categorical) data sets can be displayed. Many options are available to the user, including the colors used, the display of row and column labels, and scale bars. In this note we describe some of matrix2png's features and describe some places it has been useful in the authors' work. AVAILABILITY: A simple web interface is available, and Unix binaries are available from http://microarray.cpmc.columbia.edu/matrix2png. Source code is available on request.

14.
MOTIVATION: Sequence annotations, functional and structural data on snake venom neurotoxins (svNTXs) are scattered across multiple databases and literature sources. Sequence annotations and structural data are available in the public molecular databases, while functional data are almost exclusively available in the published articles. There is a need for a specialized svNTX database that contains NTX entries that are organized, well annotated and classified in a systematic manner. RESULTS: We have systematically analyzed svNTXs and classified them using structure-function groups based on their structural, functional and phylogenetic properties. Using conserved motifs in each phylogenetic group, we built an intelligent module for the prediction of structural and functional properties of unknown NTXs. We also developed an annotation tool to aid the functional prediction of newly identified NTXs as an additional resource for the venom research community. AVAILABILITY: We created a searchable online database of NTX protein sequences (http://research.i2r.a-star.edu.sg/Templar/DB/snake_neurotoxin). This database can also be found under the Swiss-Prot Toxin Annotation Project website (http://www.expasy.org/sprot/).

15.
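Using phospholipid fatty acid (PLFA) and Biolog methods, we studied the effects of four nitrogen management regimes on soil microbial community structure at each growth stage of rice (tillering, booting and maturity): no straw return and no fertilizer (CK); straw return + urea 1 (N allocation, after wheat harvest : before rice transplanting : tillering stage : booting stage = 0:6:2:2, T1); straw return + urea 2 (N allocation, after wheat harvest : before rice transplanting : tillering stage : booting stage = 3:3:2:2, T2); and straw return + biogas slurry + urea (N allocation, after wheat harvest : before rice transplanting : tillering stage : booting stage = 3 (biogas slurry) : 3 (2 biogas slurry + 1 urea) : 2 (urea) : 2 (urea), T3). The results showed that: (1) T3 significantly increased soil available nitrogen content at every growth stage, and available nitrogen content at maturity was significantly higher than at tillering and booting; available phosphorus and available potassium contents under T1 to T3 were higher than under CK at every growth stage, and contents at tillering were higher than at booting and maturity; the interaction between growth stage and treatment had a significant effect on soil available nitrogen, available phosphorus and available potassium contents. (2) T3 increased the intensity of microbial carbon source metabolism in paddy soil; carbohydrates, amino acids, polymers and carboxylic acids were the main carbon sources used by paddy soil microorganisms, and the interaction between growth stage and treatment had a significant effect on the microbial capacity to use carbohydrates and carboxylic acids. (3) T2 and T3 significantly increased soil microbial biomass; T2 had a relatively high fungi/bacteria ratio and was dominated by fungi, which is more favorable to the stability of the paddy soil ecosystem. These results indicate that straw return combined with simultaneous nitrogen application (urea or biogas slurry) can increase soil microbial activity and improve the soil environment.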

16.
Neurotensin (NT) is a tridecapeptide that acts as a hormone in the periphery and as a neurotransmitter in the brain. We used high-resolution nuclear magnetic resonance (NMR) to resolve the three-dimensional structure of NT in a small unilamellar vesicle (SUV) environment. We demonstrate that if the dynamics of the association–dissociation processes of peptide binding to SUVs are rapid enough, the structure can be determined by solution NMR experiments. Thus, according to the global dynamics of the system, SUVs seem to be an effective model to mimic biological membranes, especially since the lipid composition can be modified or sterols may be added to closely mimic the biological membranes studied.

An animated Interactive 3D Complement (I3DC) is available in Proteopedia at http://proteopedia.org/w/Journal:JBSD:2

17.
Broadly, computational approaches for ortholog assignment follow a three-step process: (i) identify all putative homologs between the genomes, (ii) identify gene anchors and (iii) link anchors to identify best gene matches given their order and context. In this article, we engineer two methods to improve two important aspects of this pipeline [specifically steps (ii) and (iii)]. First, computing sequence similarity data [step (i)] is a computationally intensive task for large sequence sets, creating a bottleneck in the ortholog assignment pipeline. We have designed a fast and highly scalable sort-join method (afree) based on k-mer counts to rapidly compare all pairs of sequences in a large protein sequence set to identify putative homologs. Second, the availability of complex genomes containing large gene families with a prevalence of complex evolutionary events, such as duplications, has made the task of assigning orthologs and co-orthologs difficult. Here, we have developed an iterative graph matching strategy in which, at each iteration, the best gene assignments are identified, resulting in a set of orthologs and co-orthologs. We find that the afree algorithm is faster than existing methods and maintains high accuracy in identifying similar genes. The iterative graph matching strategy also showed high accuracy in identifying complex gene relationships. Standalone afree is available from http://vbc.med.monash.edu.au/~kmahmood/afree. EGM2, the complete ortholog assignment pipeline (including afree and the iterative graph matching method), is available from http://vbc.med.monash.edu.au/~kmahmood/EGM2.
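
The general idea of a sort-join over k-mers can be sketched as follows. This is a minimal illustration under stated assumptions, not the afree implementation: the sequences, the choice k = 3, and the use of distinct shared k-mers as the similarity score are all hypothetical.

```python
from collections import Counter
from itertools import combinations, groupby

# Hypothetical toy protein sequences.
SEQS = {
    "p1": "MKVLAAGI",
    "p2": "MKVLAAGL",
    "p3": "QQWERTYP",
}


def shared_kmer_counts(seqs, k=3):
    """Count distinct k-mers shared by each pair of sequences.

    Sort step: build (k-mer, sequence id) records and sort them by k-mer.
    Join step: every run of records with the same k-mer yields the pairs
    of sequences that share it.
    """
    records = sorted(
        {(s[i:i + k], name) for name, s in seqs.items() for i in range(len(s) - k + 1)}
    )
    counts = Counter()
    for _, group in groupby(records, key=lambda rec: rec[0]):
        ids = [name for _, name in group]
        for a, b in combinations(ids, 2):
            counts[(a, b)] += 1
    return counts


if __name__ == "__main__":
    print(shared_kmer_counts(SEQS))
```

Pairs whose count exceeds a chosen threshold would be treated as putative homologs; pairs that share no k-mer never appear in the join, so they cost nothing to rule out.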

18.
MOTIVATION: Accurate gene structure annotation is a challenging computational problem in genomics. The best results are achieved with spliced alignment of full-length cDNAs or multiple expressed sequence tags (ESTs) with sufficient overlap to cover the entire gene. For most species, cDNA and EST collections are far from comprehensive. We sought to overcome this bottleneck by exploring the possibility of using combined EST resources from fairly diverged species that still share a common gene space. Previous spliced alignment tools were found inadequate for this task because they rely on very high sequence similarity between the ESTs and the genomic DNA. RESULTS: We have developed a computer program, GeneSeqer, which is capable of aligning thousands of ESTs with a long genomic sequence in a reasonable amount of time. The algorithm is uniquely designed to tolerate a high percentage of mismatches and insertions or deletions in the EST relative to the genomic template. This feature allows use of non-cognate ESTs for gene structure prediction, including ESTs derived from duplicated genes and homologous genes from related species. The increased gene prediction sensitivity results in part from novel splice site prediction models that are also available as a stand-alone splice site prediction tool. We assessed GeneSeqer performance relative to a standard Arabidopsis thaliana gene set and demonstrate its utility for plant genome annotation. In particular, we propose that this method provides a timely tool for the annotation of the rice genome, using abundant ESTs from other cereals and plants. AVAILABILITY: The source code is available for download at http://bioinformatics.iastate.edu/bioinformatics2go/gs/download.html. Web servers for Arabidopsis and other plant species are accessible at http://www.plantgdb.org/cgi-bin/AtGeneSeqer.cgi and http://www.plantgdb.org/cgi-bin/GeneSeqer.cgi, respectively. For non-plant species, use http://bioinformatics.iastate.edu/cgi-bin/gs.cgi. 
The splice site prediction tool (SplicePredictor) is distributed with the GeneSeqer code. A SplicePredictor web server is available at http://bioinformatics.iastate.edu/cgi-bin/sp.cgi

19.
MOTIVATION: The identification of DNA copy number changes provides insights that may advance our understanding of initiation and progression of cancer. Array-based comparative genomic hybridization (array-CGH) has emerged as a technique allowing high-throughput genome-wide scanning for chromosomal aberrations. A number of statistical methods have been proposed for the analysis of array-CGH data. In this article, we consider a fused quantile regression model based on three motivations: (1) quantile regression may provide a more comprehensive picture for the ratio profile of copy numbers than the standard mean regression approach; (2) for simplicity, most available methods assume uniform spacing between neighboring clones, while incorporating the information of physical locations of clones may be helpful and (3) most current methods have a set of tuning parameters that must be carefully tuned, which introduces complexity to the implementation. RESULTS: We formulate the detection of regions of gains and losses in a fused regularized quantile regression framework, incorporating physical locations of clones. We derive an efficient algorithm that computes the entire solution path for the resulting optimization problem, and we propose a simple estimate for the complexity of the fitted model, which leads to convenient selection of the tuning parameter. Three published array-CGH datasets are used to demonstrate our approach. AVAILABILITY: R code is available at http://www.stat.lsa.umich.edu/~jizhu/code/cgh/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
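
A fused quantile regression objective combines the standard quantile check loss with a penalty on differences between neighboring coefficients. The sketch below evaluates such an objective for a given fit. The abstract does not state the exact form of the penalty, so dividing each difference by the physical distance between clones is an assumption made here for illustration, as are the function names and the toy data.

```python
def check_loss(residual, tau):
    """Quantile check loss: tau * r for r >= 0, (tau - 1) * r for r < 0."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def fused_objective(y, beta, pos, tau, lam):
    """Check loss of the fit plus a distance-weighted fusion penalty.

    y    : observed log-ratios, one per clone
    beta : fitted values, one per clone
    pos  : strictly increasing physical positions of the clones
    lam  : tuning parameter for the penalty
    The 1 / distance weighting is an assumption, not taken from the paper.
    """
    fit = sum(check_loss(yi - bi, tau) for yi, bi in zip(y, beta))
    penalty = sum(
        abs(beta[i] - beta[i - 1]) / (pos[i] - pos[i - 1])
        for i in range(1, len(beta))
    )
    return fit + lam * penalty


if __name__ == "__main__":
    # A perfect piecewise-constant fit with a single jump between positions 2 and 4.
    print(fused_objective([0, 0, 1, 1], [0, 0, 1, 1], [1, 2, 4, 5], tau=0.5, lam=2.0))
```

With tau = 0.5 the check loss is half the absolute residual, so the model reduces to a fused median regression; other values of tau target other quantiles of the ratio profile.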

20.
MOTIVATION: Haplotype information has become increasingly important in analyzing fine-scale molecular genetics data, such as disease gene mapping and drug design. Parsimony haplotyping is one of the haplotyping problems belonging to the NP-hard class. RESULTS: In this paper, we aim to develop a novel algorithm for the haplotype inference problem with the parsimony criterion, based on a parsimonious tree-grow method (PTG). PTG is a heuristic algorithm that can find the minimum number of distinct haplotypes based on the criterion of keeping all genotypes resolved during the tree-grow process. In addition, a block-partitioning method is also proposed to improve the computational efficiency. We show that the proposed approach is not only effective, with high accuracy, but also very efficient, with a computational complexity on the order of O(m²n) time for n single nucleotide polymorphism sites in m individual genotypes. AVAILABILITY: The software is available upon request from the authors, or from http://zhangroup.aporc.org/bioinfo/ptg/ CONTACT: chen@elec.osaka-sandai.ac.jp SUPPLEMENTARY INFORMATION: Supporting material is available from http://zhangroup.aporc.org/bioinfo/ptg/bti572supplementary.pdf
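
The parsimony criterion itself can be made concrete with a small exhaustive search. This is not the PTG heuristic: it simply tries haplotype sets of increasing size until every genotype is resolved, which is only feasible for a few sites. The genotype encoding (0 and 1 for the two homozygous states, 2 for heterozygous) and the example data are assumptions chosen for illustration.

```python
from itertools import combinations, product

# Hypothetical genotypes over three SNP sites.
GENOTYPES = [(2, 2, 0), (0, 0, 0), (1, 1, 0)]


def resolves(h1, h2, genotype):
    """True if the haplotype pair (h1, h2) explains the genotype.

    Homozygous sites (0 or 1) need both haplotypes to carry that allele;
    heterozygous sites (2) need the two haplotypes to differ.
    """
    return all(
        (a == b == site) if site in (0, 1) else (a != b)
        for a, b, site in zip(h1, h2, genotype)
    )


def min_haplotypes(genotypes):
    """Smallest haplotype set resolving every genotype (brute force)."""
    n_sites = len(genotypes[0])
    candidates = list(product((0, 1), repeat=n_sites))
    for size in range(1, len(candidates) + 1):
        for subset in combinations(candidates, size):
            if all(
                any(resolves(h1, h2, g) for h1 in subset for h2 in subset)
                for g in genotypes
            ):
                return subset
    return None


if __name__ == "__main__":
    print(min_haplotypes(GENOTYPES))
```

The number of candidate sets grows exponentially with the number of sites, which is consistent with the problem being NP-hard and explains the interest in heuristics and block partitioning.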
