Similar literature
 Found 20 similar articles; search took 39 ms
1.
MOTIVATION: Protein-protein interaction, mediated by protein interaction sites, is intrinsic to many functional processes in the cell. In this paper, we propose a novel method to discover patterns in protein interaction sites. We observed from protein interaction networks that there exists a kind of significant substructure, called interacting protein group pairs, which exhibit an all-versus-all interaction between the two protein sets in such a pair. The full interaction between the two groups of a pair indicates a common interaction mechanism shared by the proteins in the pair, which can be referred to as an interaction type. Motif pairs at the interaction sites of the protein group pairs can be used to represent such an interaction type, with each motif derived from the sequences of a protein group by standard motif discovery algorithms. The systematic discovery of all pairs of interacting protein groups from large protein interaction networks is a computationally challenging problem. By a careful and sophisticated problem transformation, the problem is solved using efficient algorithms for mining frequent patterns, a problem extensively studied in data mining. RESULTS: We found 5349 pairs of interacting protein groups from a yeast interaction dataset. The expected value of sequence identity within the groups is only 7.48%, indicating non-homology within these protein groups. We derived 5343 motif pairs from these group pairs, represented in the form of blocks. Comparing our motifs with domains in the BLOCKS and PRINTS databases, we found that our blocks could be mapped to an average of 3.08 correlated blocks in these two databases. The mapped blocks occur in 4221 of the 6794 domains (protein groups) in these two databases. Comparing our motif pairs with iPfam, which consists of 3045 interacting domain pairs derived from PDB, we found 47 matches occurring in 105 distinct PDB complexes. Comparing with InterDom, another database of putative domain interactions, we found 203 matches.
AVAILABILITY: http://research.i2r.a-star.edu.sg/BindingMotifPairs/resources. SUPPLEMENTARY INFORMATION: http://research.i2r.a-star.edu.sg/BindingMotifPairs and Bioinformatics online.
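The all-versus-all structure described in this abstract can be illustrated with a small sketch. It is not the authors' frequent-pattern mining algorithm: it only shows the closure idea that such mining relies on, namely taking the shared interaction partners of a seed group and then the shared partners of that result. The function names, the `min_size` parameter and the toy network are assumptions made for illustration, and the enumeration is not guaranteed to be exhaustive.

```python
from itertools import combinations


def common_neighbors(adj, group):
    """Proteins that interact with every member of `group`."""
    sets = [adj[p] for p in group]
    return set.intersection(*sets) if sets else set()


def group_pairs(edges, min_size=2):
    """Find all-versus-all interacting group pairs grown from small seeds.

    `edges` is an iterable of (protein, protein) interactions.
    Self-interactions are ignored in this sketch.
    """
    adj = {}
    for a, b in edges:
        if a == b:
            continue
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)

    found = set()
    for seed in combinations(sorted(adj), min_size):
        right = common_neighbors(adj, seed)
        if len(right) < min_size:
            continue
        # Closure step: every protein that interacts with all of `right`.
        left = common_neighbors(adj, right)
        found.add(frozenset([frozenset(left), frozenset(right)]))
    return found


# Hypothetical toy network: a1 and a2 both interact with b1 and b2.
toy_edges = [("a1", "b1"), ("a1", "b2"), ("a2", "b1"), ("a2", "b2"), ("a3", "b1")]
pairs = group_pairs(toy_edges)
```

In the toy network, `a3` is left out of the group because it interacts with `b1` only, so the pair would no longer be all-versus-all.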

2.
3.
MOTIVATION: DNA motif finding is one of the core problems in computational biology, for which several probabilistic and discrete approaches have been developed. Most existing methods formulate motif finding as an intractable optimization problem and rely either on expectation maximization (EM) or on local heuristic searches. Another challenge is the choice of motif model: simpler models such as the position-specific scoring matrix (PSSM) impose biologically unrealistic assumptions such as independence of the motif positions, while more involved models are harder to parametrize and learn. RESULTS: We present MotifCut, a graph-theoretic approach to motif finding leading to a convex optimization problem with a polynomial time solution. We build a graph where the vertices represent all k-mers in the input sequences, and edges represent pairwise k-mer similarity. In this graph, we search for a motif as the maximum density subgraph, which is a set of k-mers that exhibit a large number of pairwise similarities. Our formulation does not make strong assumptions regarding the structure of the motif, and in practice both motifs that fit the PSSM model well and those that exhibit strong dependencies between position pairs are found as dense subgraphs. We benchmark MotifCut on both synthetic and real yeast motifs, and find that it compares favorably to existing popular methods. The ability of MotifCut to detect motifs appears to scale well with increasing input size. Moreover, the motifs we discover are different from those discovered by the other methods. AVAILABILITY: MotifCut server and other materials can be found at motifcut.stanford.edu.
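The graph formulation in this abstract can be sketched roughly as follows. This is not MotifCut itself: the abstract describes a maximum density subgraph search, whereas the sketch below swaps in the well-known greedy peeling heuristic, which only approximates the densest subgraph. The unweighted edges, the Hamming-distance threshold `max_mismatch` and the density definition (edges divided by vertices) are all assumptions made for illustration.

```python
from itertools import combinations


def kmers(seqs, k):
    """All distinct k-mers in the input sequences, sorted."""
    return sorted({s[i:i + k] for s in seqs for i in range(len(s) - k + 1)})


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def build_graph(nodes, max_mismatch=1):
    """Connect two k-mers when they differ in at most `max_mismatch` positions."""
    adj = {n: set() for n in nodes}
    for a, b in combinations(nodes, 2):
        if hamming(a, b) <= max_mismatch:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def densest_subgraph_greedy(adj):
    """Greedy peeling: repeatedly drop a minimum-degree vertex and keep
    the intermediate vertex set with the highest edges/vertices ratio."""
    adj = {n: set(nb) for n, nb in adj.items()}  # work on a copy
    edges = sum(len(nb) for nb in adj.values()) // 2
    best_density, best_nodes = 0.0, set(adj)
    while adj:
        density = edges / len(adj)
        if density > best_density:
            best_density, best_nodes = density, set(adj)
        v = min(adj, key=lambda n: (len(adj[n]), n))
        for u in adj[v]:
            adj[u].discard(v)
        edges -= len(adj[v])
        del adj[v]
    return best_nodes, best_density


# Hypothetical graph: a 4-clique (a, b, c, d), a pendant vertex e, an isolated f.
toy = {n: set() for n in "abcdef"}
for x, y in list(combinations("abcd", 2)) + [("a", "e")]:
    toy[x].add(y)
    toy[y].add(x)
motif_nodes, motif_density = densest_subgraph_greedy(toy)
```

On the toy graph the peeling removes the isolated and pendant vertices first, so the clique is kept as the densest set; on real k-mer graphs the result is only an approximation.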

4.
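Statistical analysis of combinatorial transcriptional regulatory sites in yeast ribosomal protein genes   Total citations: 1 (self-citations: 1, citations by others: 0)
田瑞琴, 张静, 胡俊. 《生物信息学》 (Chinese Journal of Bioinformatics), 2010, 8(2): 127-133
Transcriptional regulation of eukaryotic genes is one of the main research problems of the post-genomic era, and it rests on knowing the transcription factor binding sites (motifs) on DNA and how they are distributed. Using a Markov chain model, we counted motif occurrences in the upstream promoter sequences of yeast ribosomal protein genes and used the Z-score statistic to extract over-represented and under-represented motifs; 95% of these motifs agree with experimentally determined transcription factor binding sites. The extracted motifs were then combined into pairs and, by comparison with background sequences, we identified the motif pairs in yeast ribosomal protein genes whose occurrence probability and distance distribution are both statistically significant. These non-randomly occurring motif pairs have potential combinatorial transcriptional regulatory functions, and the combinatorial regulation of some of them is already supported by experiments. Analysis of the positions of the extracted motif pairs in the sequences shows that nearly 94% of the pairs lie upstream of the transcription start site, that for more than half of the pairs the shortest distance between the two motifs is 0-100 bp, and that close to 30% of the pairs are less than 30 bp apart; such short spacing favors cooperative action of the two motifs. These results will contribute to a deeper understanding of the transcriptional regulatory mechanism of yeast ribosomal protein genes.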
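The counting and Z-score step described in this abstract can be sketched as below. The abstract does not state the order of the Markov chain or the exact variance formula, so the first-order model, the binomial variance approximation (which ignores overlapping occurrences) and all function names here are assumptions for illustration only.

```python
import math
from collections import Counter


def markov1_word_prob(word, background):
    """Probability of `word` at a given position under a first-order
    Markov chain estimated from the background sequences."""
    mono, di = Counter(), Counter()
    for s in background:
        mono.update(s)
        di.update(s[i:i + 2] for i in range(len(s) - 1))
    prob = mono[word[0]] / sum(mono.values())
    for a, b in zip(word, word[1:]):
        # Transition probability P(b | a) from dinucleotide counts.
        starts = sum(c for d, c in di.items() if d[0] == a)
        prob *= di[a + b] / starts if starts else 0.0
    return prob


def count_occurrences(word, seqs):
    """Number of (possibly overlapping) occurrences of `word`."""
    k = len(word)
    return sum(s[i:i + k] == word for s in seqs for i in range(len(s) - k + 1))


def z_score(word, seqs, background):
    """(observed - expected) / standard deviation, binomial approximation."""
    k = len(word)
    n_pos = sum(max(len(s) - k + 1, 0) for s in seqs)
    p = markov1_word_prob(word, background)
    variance = n_pos * p * (1 - p)
    if variance == 0:
        return 0.0  # undefined in this case; reported as 0.0 in this sketch
    return (count_occurrences(word, seqs) - n_pos * p) / math.sqrt(variance)


# Hypothetical data: a repetitive background and one promoter-like sequence.
background = ["ACGT" * 25]
promoters = ["ACGACGACG"]
z = z_score("ACG", promoters, background)
```

A positive score marks a word that occurs more often than the background model predicts, and a negative score marks one that occurs less often; a real analysis would also need a significance threshold, which the abstract does not give.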

5.
Hu J, Hu H, Li X. Nucleic Acids Research, 2008, 36(13): 4488-4497
The identification of cis-regulatory modules (CRMs) can greatly advance our understanding of eukaryotic regulatory mechanisms. Current methods to predict CRMs from known motifs either depend on multiple alignments or can only deal with a small number of known motifs provided by users. These methods are problematic when binding sites are not well aligned in multiple alignments or when the number of input known motifs is large. We thus developed a new CRM identification method, MOPAT (motif pair tree), which identifies CRMs through the identification of motif modules, groups of motifs co-occurring in multiple CRMs. It can identify 'orthologous' CRMs without multiple alignments. It can also find CRMs given a large number of known motifs. We have applied this method to mouse developmental genes, and have evaluated the predicted CRMs and motif modules by microarray expression data and known interacting motif pairs. We show that the expression profiles of the genes containing CRMs of the same motif module correlate significantly better than those of a random set of genes do. We also show that the known interacting motif pairs are significantly included in our predictions. Compared with several current methods, our method shows better performance in identifying meaningful CRMs.

6.
MOTIVATION: Discovery of binding sites is important in the study of protein-protein interactions. In this paper, we introduce stable and significant motif pairs to model protein-binding sites. The stability is the pattern's resistance to some transformation. The significance is the unexpected frequency of occurrence of the pattern in a sequence dataset comprising known interacting protein pairs. Discovery of stable motif pairs is an iterative process, undergoing a chain of changing but converging patterns. Determining the starting point for such a chain is an interesting problem. We use a protein complex dataset extracted from the Protein Data Bank to help in identifying those starting points, so that the computational complexity of the problem is greatly reduced. RESULTS: We found 913 stable motif pairs, of which 765 are significant. We evaluated these motif pairs using comprehensive comparison results against random patterns. Motifs discovered in wet-lab experiments and reported in the literature were also used to confirm the effectiveness of our method. SUPPLEMENTARY INFORMATION: http://sdmc.i2r.a-star.edu.sg/BindingMotifPairs.

7.
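Studies suggest that the first intron may take part in the regulation of gene transcription. Using statistical methods, we extracted potential combinatorial transcriptional regulatory motifs from the sequences spanning the upstream region to the first intron of human housekeeping genes, analyzed features such as the distance between motifs and their regional distribution, and explored whether and how introns take part in transcriptional regulation. In total, 960 potential transcriptional regulatory motif pairs were obtained in housekeeping genes, of which 57% match experimentally known pairs of factors with transcriptional interactions, involving 12 groups of factor pairs. The analysis shows that the great majority of motif pairs (80%) are biased toward the upstream region and the "upstream-intron" region, which further supports the hypothesis that introns take part in transcriptional regulation and suggests transcriptional cooperation between introns and upstream sequences. Motifs are concentrated near the transcription start site (TSS), and the two motifs of a pair tend to lie close together: about 60% are within 200 bp of each other and, in particular, 65% of the motif pairs have a characteristic distance within 100 bp. Such short spacing favors cooperation between transcription factors. These results will contribute to a deeper understanding of human transcriptional regulatory mechanisms and of intron function.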

8.
Sequence-specific high mobility group (HMG) box factors bind and bend DNA via interactions in the minor groove. Three-dimensional NMR analyses have provided the structural basis for this interaction. The cognate HMG domain DNA motif is generally believed to span 6-8 bases. However, alignment of promoter elements controlled by the yeast genes ste11 and Rox1 has indicated strict conservation of a larger DNA motif. By site selection, we identify a highly specific 12-base pair motif for Ste11, AGAACAAAGAAA. Similarly, we show that Tcf1, MatMc, and Sox4 bind unique, highly specific DNA motifs of 12, 12, and 10 base pairs, respectively. Footprinting with a deletion mutant of Ste11 reveals a novel interaction between the 3' base pairs of the extended DNA motif and amino acids C-terminal to the HMG domain. The sequence-specific interaction of Ste11 with these 3' base pairs contributes significantly to binding and bending of the DNA motif.

9.
10.
11.
Complex brains have evolved a highly efficient network architecture whose structural connectivity is capable of generating a large repertoire of functional states. We detect characteristic network building blocks (structural and functional motifs) in neuroanatomical data sets and identify a small set of structural motifs that occur in significantly increased numbers. Our analysis suggests the hypothesis that brain networks maximize both the number and the diversity of functional motifs, while the repertoire of structural motifs remains small. Using functional motif number as a cost function in an optimization algorithm, we obtain network topologies that resemble real brain networks across a broad spectrum of structural measures, including small-world attributes. These results are consistent with the hypothesis that highly evolved neural architectures are organized to maximize functional repertoires and to support highly efficient integration of information.

12.
13.
Recent studies have shown that RNA structural motifs play essential roles in RNA folding and interaction with other molecules. Computational identification and analysis of RNA structural motifs remains a challenging task. Existing motif identification methods based on 3D structure may not properly compare motifs with high structural variations. Other structural motif identification methods consider only nested canonical base-pairing structures and cannot be used to identify complex RNA structural motifs that often consist of various non-canonical base pairs due to uncommon hydrogen bond interactions. In this article, we present a novel RNA structural alignment method for RNA structural motif identification, RNAMotifScan, which takes into consideration the isosteric (both canonical and non-canonical) base pairs and multi-pairings in RNA structural motifs. The utility and accuracy of RNAMotifScan are demonstrated by searching for kink-turn, C-loop, sarcin-ricin, reverse kink-turn and E-loop motifs against a 23S rRNA (PDBid: 1S72), which is well characterized for the occurrences of these motifs. Finally, we search for these motifs in the RNA structures of the entire Protein Data Bank and estimate their abundances. RNAMotifScan is freely available at our supplementary website (http://genome.ucf.edu/RNAMotifScan).

14.
MOTIVATION: Much research has been devoted to the characterization of interaction interfaces found in complexes with known structure. In this context, the interactions of non-homologous domains at equivalent binding sites are of particular interest, as they can reveal convergently evolved interface motifs. Such motifs are an important source of information to formulate rules for interaction specificity and to design ligands based on the common features shared among diverse partners. RESULTS: We develop a novel method to identify non-homologous structural domains which bind at equivalent sites when interacting with a common partner. We systematically apply this method to all pairs of interactions with known structure and derive a comprehensive database for these interactions. Of all non-homologous domains that bind a common interaction partner, 4.2% use the same interface of the common interaction partner (excluding immunoglobulins and proteases). This rises to 16% if immunoglobulins and proteases are included. We demonstrate two applications of our database: first, the systematic screening for viral protein interfaces, which can mimic native interfaces and thus interfere; and second, structural motifs in enzymes and their inhibitors. We highlight several cases of virus protein mimicry: viral M3 protein interferes with a chemokine dimer interface. The virus has evolved the motif SVSPLP, which mimics the native SSDTTP motif. A second example is the regulatory factor Nef in HIV, which can mimic a kinase when interacting with SH3. Among other motifs, the virus has evolved the kinase's PxxP motif. Further, we elucidate motif resemblances in Baculovirus p35 and HIV capsid proteins. Finally, chymotrypsin is examined with respect to its structural similarity to subtilisin and with respect to its inhibitor's similar recognition sites. SUPPLEMENTARY INFORMATION: A database is online at scoppi.biotec.tu-dresden.de/abac/.

15.
Soybean mosaic virus (SMV), a member of the genus Potyvirus, is transmitted by aphids in a non-persistent manner. It has been well documented that the helper component-proteinase (HC-Pro) plays a role as a 'bridge' between virion particles and aphid stylets in the aphid transmission of potyviruses. Several motifs, including the KITC and PTK motifs on HC-Pro and the DAG motif on the coat protein (CP), have been found to be involved in aphid transmission. Previously, we have shown strong interaction between SMV CP and HC-Pro in a yeast two-hybrid system (YTHS). In this report, we further analysed this CP–HC-Pro interaction based on YTHS and an in vivo binding assay to identify crucial amino acid residues for this interaction. Through this genetic approach, we identified two additional amino acid residues (H256 on CP and R455 on HC-Pro), as well as G12 on the DAG motif, crucial for the CP–HC-Pro interaction. We introduced mutations into the identified residues using an SMV infectious clone and showed that these mutations affected the efficiency of aphid transmission of SMV. We also investigated the involvement of the PTK and DAG motifs in the CP–HC-Pro interaction and aphid transmission of SMV. Our results support the concept that physical interaction between CP and HC-Pro is important for potyviral aphid transmission. Based on the combination of our current results with previous findings, we discuss the possibility that aphid transmission may be regulated by more complex molecular interactions than the simple involvement of HC-Pro as a bridge.

16.
MOTIVATION: Identification of motifs is one of the critical stages in studying the regulatory interactions of genes. Motifs can have complicated patterns. In particular, spaced motifs, an important class of motifs, consist of several short segments separated by spacers of different lengths. Locating spaced motifs is not trivial. Existing motif-finding algorithms are designed for monad motifs (short contiguous patterns with some mismatches), make assumptions about the spacer lengths, or can handle at most two segments. An effective motif finder for generic spaced motifs is highly desirable. RESULTS: This article proposes a novel approach for identifying spaced motifs with any number of spacers of different lengths. We introduce the notion of submotifs to capture the segments in the spaced motif and formulate the motif-finding problem as a frequent submotif mining problem. We provide an algorithm called SPACE to solve the problem. Based on experiments on real biological datasets, synthetic datasets and the motif assessment benchmarks by Tompa et al., we show that our algorithm performs better than existing tools for spaced motifs, with improvements in both sensitivity and specificity. For monads, SPACE performs as well as other tools. AVAILABILITY: The source code is available upon request from the authors.
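To make the notion of a spaced motif concrete, the sketch below scans a sequence for a known spaced motif given as segments plus a length range for each spacer. It is not the SPACE algorithm, which discovers motifs by frequent submotif mining and tolerates mismatches; here the segments must match exactly, and the function names, the DNA alphabet and the example sequence are assumptions for illustration.

```python
import re


def spaced_motif_regex(segments, spacers):
    """Compile a pattern for `segments` separated by variable-length spacers.

    `spacers` holds one (min_len, max_len) tuple per gap between segments.
    """
    if len(spacers) != len(segments) - 1:
        raise ValueError("need exactly one spacer range between each pair of segments")
    parts = [re.escape(segments[0])]
    for (lo, hi), seg in zip(spacers, segments[1:]):
        parts.append("[ACGT]{%d,%d}" % (lo, hi))
        parts.append(re.escape(seg))
    return re.compile("".join(parts))


def find_spaced_motif(seq, segments, spacers):
    """Return (start, matched_text) for each non-overlapping occurrence."""
    pattern = spaced_motif_regex(segments, spacers)
    return [(m.start(), m.group()) for m in pattern.finditer(seq)]


# Hypothetical example: CGG and CCG separated by a spacer of 3 to 6 bases.
hits = find_spaced_motif("TTCGGAAAAACCGTT", ["CGG", "CCG"], [(3, 6)])
```

Allowing any number of segments with independent spacer ranges is what distinguishes the generic case from two-segment (dyad) motifs; the discovery problem is harder because neither the segments nor the spacer lengths are known in advance.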

17.
The sequenced genomes of oomycete plant pathogens contain large superfamilies of effector proteins containing the protein translocation motif RXLR-dEER. However, the contributions of these effectors to pathogenicity remain poorly understood. Here, we show that the Phytophthora sojae effector protein Avr1b can contribute positively to virulence and can suppress programmed cell death (PCD) triggered by the mouse BAX protein in yeast, soybean (Glycine max), and Nicotiana benthamiana cells. We identify three conserved motifs (K, W, and Y) in the C terminus of the Avr1b protein and show that mutations in the conserved residues of the W and Y motifs reduce or abolish the ability of Avr1b to suppress PCD and also abolish the avirulence interaction of Avr1b with the Rps1b resistance gene in soybean. W and Y motifs are present in at least half of the identified oomycete RXLR-dEER effector candidates, and we show that three of these candidates also suppress PCD in soybean. Together, these results indicate that the W and Y motifs are critical for the interaction of Avr1b with host plant target proteins and support the hypothesis that these motifs are critical for the functions of the very large number of predicted oomycete effectors that contain them.

18.
Sequence variation in a widespread, recurrent, structured RNA 3D motif, the Sarcin/Ricin (S/R), was studied to address three related questions: First, how do the stabilities of structured RNA 3D motifs, composed of non-Watson–Crick (non-WC) basepairs, compare to WC-paired helices of similar length and sequence? Second, what are the effects on the stabilities of such motifs of isosteric and non-isosteric base substitutions in the non-WC pairs? And third, is there selection for particular base combinations in non-WC basepairs, depending on the temperature regime to which an organism adapts? A survey of large and small subunit rRNAs from organisms adapted to different temperatures revealed the presence of systematic sequence variations at many non-WC paired sites of S/R motifs. UV melting analysis and enzymatic digestion assays of oligonucleotides containing the motif suggest that more stable motifs tend to be more rigid. We further found that the base substitutions at non-Watson–Crick pairing sites can significantly affect the thermodynamic stabilities of S/R motifs and these effects are highly context specific indicating the importance of base-stacking and base-phosphate interactions on motif stability. This study highlights the significance of non-canonical base pairs and their contributions to modulating the stability and flexibility of RNA molecules.

19.
20.