Similar articles
1.
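RNA-binding proteins carry out important biological functions by specifically recognizing their RNA substrates. Systematic evolution of ligands by exponential enrichment (SELEX) is a basic method for in vitro selection of nucleic acid substrates: through repeated rounds of selection, it isolates from a random nucleic acid library the substrates that bind a target with high specificity and affinity. This study combines SELEX with next-generation high-throughput sequencing (NGS). An RNA library containing 20 random bases was synthesized in vitro, and the protein of interest was cloned into a vector carrying an SBP tag that can be captured by streptavidin magnetic beads. This markedly improves selection efficiency, so that a single round of selection is enough to obtain the desired RNA substrate motif. Using this method, the UP1 domain of human hnRNP A1 was found to specifically recognize two RNA sequences, AGG and AG, and EMSA experiments confirmed that it binds the RNA motifs obtained. Establishing this method is important for studying the sequence specificity with which RNA-binding proteins recognize their substrates and for further understanding their regulatory mechanisms in vivo.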

2.
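Cai Juan, Wang Jianxin, Li Min, Chen Gang. 《生物信息学》 (Chinese Journal of Bioinformatics), 2011, 9(3): 185-188
Cluster analysis of biological networks is an important method for identifying functional modules and predicting protein function, and visualizing the clustering results is also important for analyzing biological network structure quickly and effectively. Based on an analysis of the architecture of Cytoscape, a platform for displaying and analyzing biological networks, the authors designed ClusterViz, an easy-to-use plug-in for cluster analysis and visualization. It is an extensible platform that integrates clustering algorithms: new algorithms can be added over time, and the results of different algorithms can be compared. Three typical algorithms have been implemented so far. The plug-in can serve as an effective tool for studying the mechanisms of protein-protein interaction networks.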

3.
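Genetic algorithms are computational models that simulate biological evolution and serve as global optimization search algorithms. This paper presents a new method that applies a genetic algorithm to the identification of transcription factor binding sites. It uses a consensus sequence model to describe the conserved motif and encodes the alignment of the motif sequence against the sequences under test, which turns the task into an optimization problem over a search space. The genetic algorithm then searches for the optimal solution to predict the transcription factor binding sites. Experimental results show that the new method is effective: it accurately identifies the binding sites in the test sequences while using little memory.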

4.
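Hepatitis B is a very serious infectious disease worldwide, and hepatitis B virus (HBV) is its direct cause. Mutation is an important part of HBV evolution, and in recent years it has been studied extensively in China and abroad. Research on conserved sequences in HBV genomes, however, is still in its infancy. This paper first uses the MEME (Multiple EM for Motif Elicitation) algorithm to mine HBV motifs (conserved sequence fragments in biological sequences) and proposes a new metric, the conserved index (CI). It then performs a phylogenetic analysis of HBV sequences and finally evaluates the reliability of the resulting phylogenetic tree. The results show that the new CI metric makes effective use of the multiple conserved sequences mined by MEME to construct a phylogenetic tree of HBV sequences, analyze the evolutionary relationships among them, and identify likely ancestral sequences of the samples. The experimental approach offers useful guidance for research on methods for analyzing large HBV data sets.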

5.
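Dictyostelium discoideum is the best-studied model organism among the slime molds, and its predation is closely linked to actin polymerization. To explore the sequence features of D. discoideum actins, this study used bioinformatics methods to analyze the protein-protein interactions of 32 D. discoideum actins and the conserved motifs they may contain. The results show that the 32 actins have one complex group of interactions with other proteins and five relatively simple groups. MEME SUITE was used to analyze separately the conserved motifs of the 32 D. discoideum actin sequences and those of actins from 21 organisms that best match actin17, yielding six conserved motifs in total: motif1, motif2, motif3, Motif1, Motif2 and Motif3. Of these, motif1, Motif1 and Motif3 are newly identified in this study, and all three map to important positions in the three-dimensional structure of Profilin-actin-VASP202–244 (PDB ID: 3CHW). These results suggest that actin3, actin10, actin14, actin15, actin17 and actin31 may be relatively important actins in D. discoideum, and that motif1, Motif1 and Motif3 may be important conserved motifs in the evolution of its actins.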

6.
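Building on existing alignment algorithms for biological macromolecular sequences and incorporating the critical path method from graph theory, the authors propose a new algorithm for computing the maximum degree of pairing between two oligonucleotide sequences. Combined with a generate-and-test approach, the algorithm can find a set of oligonucleotide sequences of a given length suitable for DNA computing. DNA chip hybridization was used to verify the hybridization specificity of a set of sequences designed with the algorithm.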

7.
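An effective algorithm for identifying repetitive sequences   Cited by: 1 (self-citations: 0; cited by others: 1)
Li Dongdong, Wang Zhengzhi, Ni Qingshan. 《生物信息学》 (Chinese Journal of Bioinformatics), 2005, 3(4): 163-166, 174
Analyzing repetitive sequences is an important topic in genome research, and it rests on finding the repeats in a genome sequence quickly and effectively. This paper proposes a projection-and-assembly algorithm: random projection is used to obtain a set of candidate fragments, and fragment assembly then joins the candidates to reveal the repetitive sequences in the genome. The computational complexity of the algorithm is analyzed, semi-simulated test data are constructed, and the test results demonstrate the algorithm's effectiveness.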

8.
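Research progress on GATA transcription factors   Cited by: 3 (self-citations: 0; cited by others: 3)
The GATA family is a class of transcriptional regulators that recognize and bind the GATA motif, and its members generally contain zinc finger structures. A feature shared across the family is high affinity for the consensus sequence (T/A)GATA(A/G). Family members have been found widely in animals, fungi, plants and other organisms.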
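
The consensus (T/A)GATA(A/G) maps directly onto a regular expression. The following is a minimal sketch, not taken from the article: the function name `find_gata_sites` is hypothetical, and the scan covers only the strand it is given (reverse-complement matches are ignored).

```python
import re

# Consensus (T/A)GATA(A/G) written as a character-class pattern.
# The lookahead lets overlapping sites be reported as well.
GATA_SITE = re.compile(r"(?=([TA]GATA[AG]))")

def find_gata_sites(seq):
    """Return (0-based start, matched site) for each consensus match on the given strand."""
    seq = seq.upper()
    return [(m.start(), m.group(1)) for m in GATA_SITE.finditer(seq)]

print(find_gata_sites("ccTGATAAggAGATAGtt"))
# [(2, 'TGATAA'), (10, 'AGATAG')]
```

An exact-match scan like this only finds candidate sites; it says nothing about binding affinity in vivo.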

9.
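Biological sequence assembly and its algorithms   Cited by: 1 (self-citations: 0; cited by others: 1)
Biological sequence assembly is an important step in shotgun sequencing. This paper introduces biological sequence assembly and some basic problems that arise in its study, outlines the two main classes of assembly algorithms, analyzes their respective characteristics, and compares them.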

10.
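Protein fold recognition is one of the important methods for predicting protein three-dimensional structure and has been applied successfully in many areas of the biological sciences. Over the past decade, a series of fold recognition methods based on different computational approaches have appeared. Among them, machine learning and profile-profile alignment are two of the most widely used and effective. Besides advances in computational methods, the steadily growing protein structure databases are another important factor in the continuing improvement of fold recognition accuracy. This article briefly reviews state-of-the-art algorithms for protein fold recognition and discusses some strategies that might be used to improve them.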

11.
SUMMARY: Biological and engineered networks have recently been shown to display network motifs: a small set of characteristic patterns that occur much more frequently than in randomized networks with the same degree sequence. Network motifs were demonstrated to play key information processing roles in biological regulation networks. Existing algorithms for detecting network motifs act by exhaustively enumerating all subgraphs with a given number of nodes in the network. The runtime of such algorithms increases strongly with network size. Here, we present a novel algorithm that allows estimation of subgraph concentrations and detection of network motifs at a runtime that is asymptotically independent of the network size. This algorithm is based on random sampling of subgraphs. Network motifs are detected with a surprisingly small number of samples in a wide variety of networks. Our method can be applied to estimate the concentrations of larger subgraphs in larger networks than was previously possible with exhaustive enumeration algorithms. We present results for high-order motifs in several biological networks and discuss their possible functions. AVAILABILITY: A software tool for estimating subgraph concentrations and detecting network motifs (mfinder 1.1) and further information is available at http://www.weizmann.ac.il/mcb/UriAlon/

12.
Motif discovery methods play pivotal roles in deciphering the genetic regulatory codes (i.e., motifs) in genomes as well as in locating conserved domains in protein sequences. The Expectation Maximization (EM) algorithm is one of the most popular methods used in de novo motif discovery. Based on the position weight matrix (PWM) updating technique, this paper presents a Monte Carlo version of the EM motif-finding algorithm that carries out stochastic sampling in local alignment space to overcome the conventional EM's main drawback of being trapped in a local optimum. The newly implemented algorithm is named the Monte Carlo EM Motif Discovery Algorithm (MCEMDA). MCEMDA starts from an initial model and then iteratively performs Monte Carlo simulation and parameter updates until convergence. A log-likelihood profiling technique together with the top-k strategy is introduced to cope with the phase shifts and multiple modal issues in the motif discovery problem. A novel grouping motif alignment (GMA) algorithm is designed to select motifs by clustering a population of candidate local alignments and is successfully applied to subtle motif discovery. MCEMDA compares favorably to other popular PWM-based and word enumerative motif algorithms tested using simulated (l, d)-motif cases and documented prokaryotic and eukaryotic DNA motif sequences. Finally, MCEMDA is applied to detect large blocks of conserved domains using protein benchmarks and exhibits excellent capacity when compared with other multiple sequence alignment methods.
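
The PWM that such EM-style methods update can be illustrated with a short sketch. This is not MCEMDA itself: it only shows how a PWM is estimated from a set of aligned sites and how a sequence is rescanned with it, which are the two ingredients an EM iteration alternates between. The function names, the pseudocount of 1 and the uniform background of 0.25 are all assumptions made for the example.

```python
import math

ALPHABET = "ACGT"

def build_pwm(sites, pseudocount=1.0):
    """Column-wise base probabilities from equal-length aligned sites."""
    width = len(sites[0])
    pwm = []
    for j in range(width):
        counts = {b: pseudocount for b in ALPHABET}  # pseudocounts avoid zero probabilities
        for s in sites:
            counts[s[j]] += 1
        total = sum(counts.values())
        pwm.append({b: counts[b] / total for b in ALPHABET})
    return pwm

def log_odds(window, pwm, background=0.25):
    """Log-likelihood ratio of a window under the PWM versus a uniform background (assumed)."""
    return sum(math.log(col[b] / background) for b, col in zip(window, pwm))

def best_window(seq, pwm):
    """Return (start, score) of the highest-scoring window in seq."""
    width = len(pwm)
    scored = [(log_odds(seq[i:i + width], pwm), i) for i in range(len(seq) - width + 1)]
    score, start = max(scored)
    return start, score

pwm = build_pwm(["TGACT", "TGACT", "TGAGT"])
start, score = best_window("AAAATGACTAAAA", pwm)
print(start)  # 4
```

A deterministic EM loop would repeat these two steps until the PWM stops changing; the abstract's Monte Carlo variant instead samples site positions stochastically to escape local optima.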

13.
Motifs in a given network are small connected subnetworks that occur in significantly higher frequencies than would be expected in random networks. They have recently gathered much attention as a concept to uncover structural design principles of complex networks. Kashtan et al. [Bioinformatics, 2004] proposed a sampling algorithm for performing the computationally challenging task of detecting network motifs. However, among other drawbacks, this algorithm suffers from a sampling bias and scales poorly with increasing subgraph size. Based on a detailed analysis of the previous algorithm, we present a new algorithm for network motif detection which overcomes these drawbacks. Furthermore, we present an efficient new approach for estimating the frequency of subgraphs in random networks that, in contrast to previous approaches, does not require the explicit generation of random networks. Experiments on a testbed of biological networks show our new algorithms to be orders of magnitude faster than previous approaches, allowing for the detection of larger motifs in bigger networks than previously possible and thus facilitating deeper insight into the field.

14.
Network motifs are small connected subgraphs that have recently attracted much attention as a way to uncover the structural behavior of large, complex networks. Finding motifs of any size is one of the most important problems for such networks, and it requires fast and reliable algorithms and tools. CytoKavosh is one of the best choices for finding motifs of any given size in any complex network. It relies on a fast algorithm, Kavosh, which makes it faster than other existing tools. The Kavosh algorithm applies several well-known algorithmic techniques and includes subtle details that make it efficient in this field. CytoKavosh is a Cytoscape plug-in that finds motifs of a given size in a network (directed or undirected) already loaded into the Cytoscape workspace. Its high performance is achieved by dynamically linking the highly optimized C++ functions of Kavosh to the Cytoscape Java program, which makes the plug-in suitable for analyzing large biological networks. Notable attributes of CytoKavosh are its efficiency in time and memory usage and the absence of any implementation-imposed limit on motif size. CytoKavosh is implemented in Cytoscape, a visual environment in which users can conveniently interact with the network and create visual options for analyzing its structural behavior. The plug-in works on any given network, is very simple to use, and generates graphical results for the discovered motifs with any required details. No existing Cytoscape plug-in is dedicated to finding network motifs based on the original concept, so we introduce CytoKavosh as the first such plug-in, and we hope that it can be extended with further options to make it the best motif-analysis tool.

15.
Subgraph matching algorithms are designed to find all instances of predefined subgraphs in a large graph or network and play an important role in the discovery and analysis of so-called network motifs, subgraph patterns which occur more often than expected by chance. We present the index-based subgraph matching algorithm (ISMA), a novel tree-based algorithm. ISMA realizes a speedup compared to existing algorithms by carefully selecting the order in which the nodes of a query subgraph are investigated. In order to achieve this, we developed a number of data structures and maximally exploited symmetry characteristics of the subgraph. We compared ISMA to a naive recursive tree-based algorithm and to a number of well-known subgraph matching algorithms. Our algorithm outperforms the other algorithms, especially on large networks and with large query subgraphs. An implementation of ISMA in Java is freely available at http://sourceforge.net/projects/isma/.

16.
Protein-protein interaction (PPI) networks of many organisms share global topological features such as degree distribution, k-hop reachability, betweenness and closeness. Yet, some of these networks can differ significantly from the others in terms of local structures: e.g. the number of specific network motifs can vary significantly among PPI networks. Counting the number of network motifs provides a major challenge to compare biomolecular networks. Recently developed algorithms have been able to count the number of induced occurrences of subgraphs with k ≤ 7 vertices. Yet no practical algorithm exists for counting non-induced occurrences, or counting subgraphs with k ≥ 8 vertices. Counting non-induced occurrences of network motifs is not only challenging but also quite desirable, as available PPI networks include several false interactions and miss many others. In this article, we show how to apply the 'color coding' technique for counting non-induced occurrences of subgraph topologies in the form of trees and bounded treewidth subgraphs. Our algorithm can count all occurrences of motif G' with k vertices in a network G with n vertices in time polynomial in n, provided k = O(log n). We use our algorithm to obtain 'treelet' distributions for k ≤ 10 of available PPI networks of unicellular organisms (Saccharomyces cerevisiae, Escherichia coli and Helicobacter pylori), which are all quite similar, and a multicellular organism (Caenorhabditis elegans), which is significantly different. Furthermore, the treelet distributions of the unicellular organisms are similar to that obtained by the 'duplication model' but are quite different from that of the 'preferential attachment model'. The treelet distribution is robust w.r.t. sparsification with bait/edge coverage of 70%, but differences can be observed when bait/edge coverage drops to 50%.
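
The color coding idea can be sketched for the simplest tree shape, a simple path on k vertices. This is an illustration under stated assumptions rather than the article's algorithm, which handles general trees and bounded treewidth subgraphs. Vertices receive one of k random colors, "colorful" paths (all colors distinct) are counted by dynamic programming over color sets, and the count is rescaled by the probability k!/k^k that a fixed path is colorful. The function names and the adjacency-dict input format are hypothetical.

```python
import random
from math import factorial

def count_colorful_paths(adj, colors, k):
    """Count undirected simple paths on k >= 2 vertices whose colors are all distinct.

    adj: dict node -> set of neighbours; colors: dict node -> int in range(k).
    """
    # table[(v, S)] = number of colorful paths ending at v that use exactly color set S
    table = {(v, frozenset([colors[v]])): 1 for v in adj}
    for _ in range(k - 1):
        extended = {}
        for (v, used), n in table.items():
            for w in adj[v]:
                if colors[w] not in used:  # distinct colors imply distinct vertices
                    key = (w, used | {colors[w]})
                    extended[key] = extended.get(key, 0) + n
        table = extended
    # each undirected path is discovered once from each of its two ends
    return sum(table.values()) // 2

def estimate_path_count(adj, k, trials=200, seed=0):
    """Average colorful counts over random colorings and rescale by k!/k^k."""
    rng = random.Random(seed)
    p_colorful = factorial(k) / k ** k
    total = 0
    for _ in range(trials):
        colors = {v: rng.randrange(k) for v in adj}
        total += count_colorful_paths(adj, colors, k)
    return total / trials / p_colorful

triangle = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}
print(count_colorful_paths(triangle, {"a": 0, "b": 1, "c": 2}, 3))  # 3
```

The table has at most n·2^k entries, which is where the polynomial running time for k = O(log n) comes from.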

17.
Network motifs are statistically overrepresented sub-structures (sub-graphs) in a network, and have been recognized as 'the simple building blocks of complex networks'. Study of biological network motifs may reveal answers to many important biological questions. The main difficulty in detecting larger network motifs in biological networks lies in the facts that the number of possible sub-graphs increases exponentially with the network or motif size (node counts, in general), and that no known polynomial-time algorithm exists in deciding if two graphs are topologically equivalent. This article discusses the biological significance of network motifs, the motivation behind solving the motif-finding problem, and strategies to solve the various aspects of this problem. A simple classification scheme is designed to analyze the strengths and weaknesses of several existing algorithms. Experimental results derived from a few comparative studies in the literature are discussed, with conclusions that lead to future research directions.
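
As a concrete instance of the counting step, the sketch below enumerates one classic three-node pattern, the feed-forward loop, by brute force over node triples. It is an illustrative assumption, not one of the surveyed algorithms: the function name is hypothetical, occurrences are counted in the non-induced sense (extra edges among the three nodes are allowed), and the cubic enumeration is exactly what becomes infeasible as motif size grows.

```python
from itertools import permutations

def count_feed_forward_loops(edges):
    """Count ordered triples (a, b, c) with directed edges a->b, b->c and a->c."""
    edge_set = set(edges)
    nodes = {n for edge in edge_set for n in edge}
    return sum(
        1
        for a, b, c in permutations(nodes, 3)
        if (a, b) in edge_set and (b, c) in edge_set and (a, c) in edge_set
    )

network = [("X", "Y"), ("Y", "Z"), ("X", "Z"), ("Z", "W")]
print(count_feed_forward_loops(network))  # 1
```

Deciding whether the pattern is a motif would additionally require comparing this count with counts in suitably randomized networks, which the sketch does not do.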

18.
19.
Molecular entities work in concert as a system and mediate phenotypic outcomes and disease states. There has been recent interest in modelling the associations between molecular entities from their observed expression profiles as networks using a battery of algorithms. These networks have proven to be useful abstractions of the underlying pathways and signalling mechanisms. Noise is ubiquitous in molecular data and can have a pronounced effect on the inferred network. Noise can be an outcome of several factors including: inherent stochastic mechanisms at the molecular level, variation in the abundance of molecules, heterogeneity, sensitivity of the biological assay or measurement artefacts prevalent especially in high-throughput settings. The present study investigates the impact of discrepancies in noise variance on pair-wise dependencies, conditional dependencies and constraint-based Bayesian network structure learning algorithms that incorporate conditional independence tests as a part of the learning process. Popular network motifs and fundamental connections, namely: (a) common-effect, (b) three-chain, and (c) coherent type-I feed-forward loop (FFL) are investigated. The choice of these elementary networks can be attributed to their prevalence across more complex networks. Analytical expressions elucidating the impact of discrepancies in noise variance on pairwise dependencies and conditional dependencies for special cases of these motifs are presented. Subsequently, the impact of noise on two popular constraint-based Bayesian network structure learning algorithms such as Grow-Shrink (GS) and Incremental Association Markov Blanket (IAMB) that implicitly incorporate tests for conditional independence is investigated. Finally, the impact of noise on networks inferred from publicly available single cell molecular expression profiles is investigated. 
While discrepancies in noise variance are overlooked in routine molecular network inference, the results presented clearly elucidate their non-trivial impact on the conclusions that in turn can challenge the biological significance of the findings. The analytical treatment and arguments presented are generic and not restricted to molecular data sets.
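
The effect of noise variance on a pairwise dependency can be seen in a textbook special case. The following is a worked illustration under assumptions not taken from the abstract: a single linear link $Y = aX + \varepsilon$ with noise $\varepsilon$ independent of $X$, $\operatorname{Var}(X) = \sigma_X^2$ and $\operatorname{Var}(\varepsilon) = \sigma_\varepsilon^2$. The symbols $a$, $\sigma_X$ and $\sigma_\varepsilon$ are introduced here for the example only.

```latex
\[
\operatorname{Cov}(X, Y) = a\,\sigma_X^2,
\qquad
\operatorname{Var}(Y) = a^2\sigma_X^2 + \sigma_\varepsilon^2,
\]
\[
\rho_{XY}
  = \frac{\operatorname{Cov}(X, Y)}{\sigma_X \sqrt{\operatorname{Var}(Y)}}
  = \frac{a\,\sigma_X}{\sqrt{a^2\sigma_X^2 + \sigma_\varepsilon^2}} .
\]
```

For fixed $a$ and $\sigma_X$, $|\rho_{XY}|$ shrinks toward zero as $\sigma_\varepsilon$ grows, so two links of equal strength but unequal noise variance yield different correlations, and a dependency test with a fixed threshold may retain one and reject the other.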

20.