Similar articles
 Found 20 similar articles (search time: 0 ms)
1.
MOTIVATION AND RESULTS: Motivated by the recent rise of interest in small regulatory RNAs, we present Locomotif, a new approach for locating RNA motifs that goes beyond previous ones in three ways: (1) Motif search is based on efficient dynamic programming algorithms, incorporating the established thermodynamic model of RNA secondary structure formation. (2) Motifs are described graphically, using a Java-based editor, and search algorithms are derived from the graphics in a fully automatic way. The editor allows us to draw secondary structures, annotated with size and sequence information. These drawings closely resemble the established but informal way in which RNA motifs are communicated in the literature. Thus, the learning effort for Locomotif users is minimal. (3) Locomotif employs a client-server approach. Motifs are designed by the user locally. Search programs are generated and compiled on a bioinformatics server. They are made available both for execution on the server and for download as C source code plus an appropriate makefile. AVAILABILITY: Locomotif is available at http://bibiserv.techfak.uni-bielefeld.de/locomotif.

2.
This paper presents a language for describing arrangements of motifs in biological sequences, and a program that uses the language to find the arrangements in motif match databases. The program does not by itself search for the constituent motifs, and is thus independent of how they are detected, which allows it to use motif match data of various origins. AVAILABILITY: The program can be tested online at http://hits.isb-sib.ch and the distribution is available from ftp://ftp.isrec.isb-sib.ch/pub/software/unix/mmsearch-1.0.tar.gz CONTACT: Thomas.Junier@isrec.unil.ch SUPPLEMENTARY INFORMATION: The full documentation about mmsearch is available from http://hits.isb-sib.ch/~tjunier/mmsearch/doc.

3.
Many different types of generative models for protein sequences have been proposed in the literature. Their uses include the prediction of mutational effects, protein design and the prediction of structural properties. Neural network (NN) architectures have shown strong performance, commonly attributed to their capacity to extract non-trivial higher-order interactions from the data. In this work, we analyze two different NN models and assess how close they are to simple pairwise distributions, which have been used in the past for similar problems. We present an approach for extracting pairwise models from more complex ones using an energy-based modeling framework. We show that for the tested models the extracted pairwise models can replicate the energies of the original models and are also close in performance in tasks like mutational effect prediction. In addition, we show that even simpler, factorized models often come close in performance to the original models.

4.

Background  

A structured motif allows variable length gaps between several components, where each component is a simple motif, which allows either no gaps or only fixed length gaps. The motif can be represented either as a pattern or as a profile (also called a positional weight matrix). We propose an efficient algorithm, called SMOTIF, to solve the structured motif search problem, i.e., given one or more sequences and a structured motif, SMOTIF searches the sequences for all occurrences of the motif. Potential applications include searching for long terminal repeat (LTR) retrotransposons and composite regulatory binding sites in DNA sequences.
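
To make the problem statement concrete, here is a minimal brute-force sketch of structured motif search. It is not the SMOTIF algorithm, whose details the abstract does not give. It assumes that every component is an exact string pattern and that each gap is given as an inclusive (min, max) range; the function name and argument layout are hypothetical.

```python
from typing import List, Tuple


def find_structured_motif(
    seq: str,
    components: List[str],
    gaps: List[Tuple[int, int]],
) -> List[Tuple[int, ...]]:
    """Return one tuple of component start positions per occurrence.

    components: exact-match simple motifs (assumption: no wildcards).
    gaps: gaps[i] is the inclusive (min, max) gap length allowed between
          components[i] and components[i + 1].
    """
    if len(gaps) != len(components) - 1:
        raise ValueError("need exactly one gap range between consecutive components")

    hits: List[Tuple[int, ...]] = []

    def extend(idx: int, pos: int, starts: Tuple[int, ...]) -> None:
        # Component idx must begin exactly at position pos.
        comp = components[idx]
        if not seq.startswith(comp, pos):
            return
        starts = starts + (pos,)
        end = pos + len(comp)
        if idx == len(components) - 1:
            hits.append(starts)
            return
        lo, hi = gaps[idx]
        # Try every allowed gap length before the next component.
        for gap in range(lo, hi + 1):
            extend(idx + 1, end + gap, starts)

    for start in range(len(seq)):
        extend(0, start, ())
    return hits
```

For example, `find_structured_motif("AATTGACAGGGGGTATAATCC", ["TTGACA", "TATAAT"], [(3, 7)])` reports the two components separated by a five-base gap. The running time of this sketch grows with the product of the gap widths, which is the cost an efficient algorithm would aim to avoid.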

5.
6.
7.
Motif detection based on Gibbs sampling is a common procedure used to retrieve regulatory motifs in silico. Using a species-specific background model was previously shown to increase the robustness of the algorithm. Here, we demonstrate that selecting a non-species-adapted background model can have an adverse effect on the results of motif detection. The large differences in the average nucleotide composition of prokaryotic sequences exacerbate the problem of exchanging background models. Therefore, we have developed complex background models for all prokaryotic species with available genome sequences.

8.
The application of molecular replacement (MR) in macromolecular crystallography can be limited by the "model bias" problem. Here we propose a strategy to reduce model bias when only part of a new structure is known: after the MR search, structure determination of the unknown part of the new structure can be facilitated by cross-crystal averaging of the known part of the new structure with the search model. This strategy dramatically improves electron density in the unknown part of the new structure. It has enabled us to determine the structures of two coronavirus receptor-binding domains, each complexed with its receptor, at moderate resolutions. In a test case, it also enabled automated model building when >50% of an antigen-antibody complex was absent. These results suggest that this averaging strategy can be routinely used after MR to enhance the interpretability of electron density associated with the missing part of the model.

9.
10.
Sequence elements at all levels (DNA, RNA and protein) play a central role in mediating molecular recognition and thereby molecular regulation and signaling. Studies that focus on measuring and investigating sequence-based recognition make use of statistical and computational tools, including approaches to searching sequence motifs. State-of-the-art motif searching tools are limited in their coverage and ability to address large motif spaces. We develop and present statistical and algorithmic approaches that take as input ranked lists of sequences and return significant motifs. The efficiency of our approach, based on suffix trees, allows searches over motif spaces that are not covered by existing tools. This includes searching variable gap motifs (two half sites with a flexible length gap in between) and searching long motifs over large alphabets. We used our approach to analyze several high-throughput measurement data sets and report some validation results as well as novel suggested motifs and motif refinements. We suggest a refinement of the known estrogen receptor 1 motif in humans, where we observe gaps other than three nucleotides that also serve as significant recognition sites, as well as a variable length motif related to potential tyrosine phosphorylation.

11.
Previous studies from this laboratory demonstrated that N-methylation at the Lys(5) residue in somatostatin octapeptide antagonist analogues increased the GH release inhibition potency by as much as 300%. We have now further investigated N-methylation of this Lys(5) residue in conjunction with a number of N- and C-terminal modifications previously found to give highly potent somatostatin receptor antagonists. Synthetic analogues were tested in a functional assay for their ability to inhibit somatostatin-inhibited GH release from rat pituitary cells in culture and to displace 125I-labeled somatostatin from CHO cells transfected with the five known human somatostatin receptors. Several interesting observations resulted from the study. Replacement of lipophilic Nal(8) at the C-terminus with a hydrophilic His(8) resulted in increased affinity and selectivity for the type 2 receptor to give the most potent antagonist analogue yet discovered (K(i), 1.5 nM), although in the rat pituitary cells inhibitory activity on somatostatin-inhibited GH release decreased somewhat. A His(3) substitution within the cyclic portion of the analogues retained pituitary cell potency and affinity for the type 2 receptor, as did substitution with Bip(8) and Fpa(1). Replacement of Cpa(1) with Iph(1) did not affect the affinity for the type 2 receptor significantly, but did decrease the effects on rat cell GH release. Iph(3) within-ring substitution increased the selectivity for sst(2) appreciably, although the affinity for that receptor was considerably decreased. Substitution of Npa(3) resulted in good selectivity for the sst(2) receptor. Replacement of Nal(8) with D-Trp(8) also increased the selectivity for the type 2 receptor. Use of a 'bivalent ligand' approach in which two peptides were joined by 4,4'-biphenyldicarbonyl as a spacer destroyed the affinity for all the subtypes; however, the bivalent ligand formed with the Ahp spacer displayed significant affinity and high selectivity for the type 2 receptor.

12.
MOTIVATION: The discovery of patterns shared by several sequences that differ greatly is a basic task in sequence analysis, and still a challenge. Several methods have been developed for detecting patterns. Methods commonly used for motif search include the Gibbs sampler, the Expectation-Maximization (EM) algorithm and some intuitive greedy approaches. One cannot guarantee the optimality of the result produced by the Gibbs sampler in a single run. The deterministic EM methods tend to get trapped in local optima. Solutions found by greedy approaches are rarely sufficiently good. RESULTS: A simple model describing a motif or a portion of a local multiple sequence alignment is the weight matrix model, in which a motif is characterized by position-specific probabilities. Two substitution matrices are proposed to relate the sequence similarity to the weight matrix. Combining the substitution matrix and weight matrix, we examine three typical sets of protein sequences with increasing complexity. At a low score threshold for pair similarity, sliding windows are compared with a seed window to find the score sum, which provides a measure of statistical significance for multiple sequence comparison. Such a similarity analysis reveals many aspects of motifs. Blocks determined by similarity can be used to deduce a primary weight matrix or an improved substitution matrix. The algorithm successfully obtains the optimal solution for the test sets using greedy iteration alone.
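
The weight matrix model mentioned above can be illustrated with a short sketch. It only shows how position-specific probabilities are estimated from an ungapped block and used to score a window; it does not reproduce the paper's substitution matrices or its greedy iteration. The pseudocount, the uniform background and the function names are assumptions made for this illustration, and a four-letter alphabet is used for brevity although the abstract deals with proteins.

```python
import math
from collections import Counter
from typing import Dict, List


def build_weight_matrix(
    block: List[str], alphabet: str = "ACGT", pseudocount: float = 1.0
) -> List[Dict[str, float]]:
    """Estimate position-specific probabilities from an ungapped aligned block."""
    width = len(block[0])
    if any(len(row) != width for row in block):
        raise ValueError("all sequences in the block must have the same length")
    total = len(block) + pseudocount * len(alphabet)
    matrix = []
    for col in range(width):
        counts = Counter(row[col] for row in block)
        # Pseudocounts keep unseen symbols from getting probability zero.
        matrix.append({a: (counts[a] + pseudocount) / total for a in alphabet})
    return matrix


def log_odds_score(
    window: str, matrix: List[Dict[str, float]], background: Dict[str, float]
) -> float:
    """Sum of log2(p_position(symbol) / p_background(symbol)) over the window."""
    if len(window) != len(matrix):
        raise ValueError("window length must equal the motif width")
    return sum(
        math.log2(matrix[i][symbol] / background[symbol])
        for i, symbol in enumerate(window)
    )
```

With `block = ["ACGT", "ACGA", "ACGT", "TCGT"]` and a uniform background of 0.25 per base, a window equal to the consensus `ACGT` scores higher than an unrelated window such as `TTTT`.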

13.
14.
A need for menses-inducing drugs as an additional method for the regulation of human fertility still exists. Based on the present knowledge of the regulation of early human pregnancy, potential progesterone antagonists should induce menstruation at the time of missed menses in a fertile cycle. The experimental models in rats, rabbits and rhesus monkeys, which would allow detection of progesterone antagonists, are described. A potential progesterone antagonist is first assessed for antigestagenic activity in the rabbit. In this test, oestrogens or compounds with oestrogenic properties have antigestagenic effects. Since it is known that oestrogens even in very high doses have no menses-inducing effect in early human pregnancy, such compounds have to be recognized and excluded from further development. In rhesus monkeys, the HCG-prolonged luteal phase was selected as a model situation for early pregnancy. It was shown that oestradiol, norethisterone and lynestrenol, compounds which have antigestagenic activity in the rabbit due to their oestrogenic properties, do not induce drug-related menstruation. 11-Deoxycorticosterone and bromacetoxyprogesterone have antigestagenic activity in the rabbit which may be due to competitive progesterone antagonism; these compounds would, however, have too many side effects and are not pursued. No compound with menses-inducing activity has so far been identified; however, the experimental procedure for the study of future active compounds has been well defined.

15.
Many if not all models of disease transmission on networks can be linked to the exact state-based Markovian formulation. However, the large number of equations for any system of realistic size limits their applicability to small populations. As a result, most modelling work relies on simulation and pairwise models. In this paper, for a simple SIS dynamics on an arbitrary network, we formalise the link between a well-known pairwise model and the exact Markovian formulation. This involves the rigorous derivation of the exact ODE model at the level of pairs in terms of the expected number of pairs and triples. The exact system is then closed using two different closures, one well established and one that has been recently proposed. A new interpretation of both closures is presented, which explains several of their previously observed properties. The closed dynamical systems are solved numerically and the results are compared to output from individual-based stochastic simulations. This is done for a range of networks with the same average degree and clustering coefficient but generated using different algorithms. It is shown that the ability of the pairwise system to accurately model an epidemic is fundamentally dependent on the underlying large-scale network structure. We show that the existing pairwise models are a good fit for certain types of network but have to be used with caution as higher-order network structures may compromise their effectiveness.
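
As an illustration of what a closed pairwise SIS system looks like, the sketch below integrates the standard pairwise equations on a network where every node has n neighbours, closing triples with the commonly used approximation [ASI] ≈ ((n − 1)/n)·[AS][SI]/[S]. This is an assumption made for the example: the abstract does not state which two closures the paper uses, and the forward-Euler integrator, parameter names and random initial mixing are likewise choices made here for brevity.

```python
from typing import Tuple

State = Tuple[float, float, float, float, float]  # [S], [I], [SS], [SI], [II]


def pairwise_sis_rhs(state: State, tau: float, gamma: float, n: float) -> State:
    """Right-hand side of the closed pairwise SIS model.

    tau: per-link transmission rate, gamma: recovery rate, n: node degree.
    Pairs are counted in both directions, so [SS] + 2[SI] + [II] = n * N.
    """
    S, I, SS, SI, II = state
    k = (n - 1.0) / n
    # Triple closure (assumed): [ASI] ~ k * [AS] * [SI] / [S]
    SSI = k * SS * SI / S if S > 0 else 0.0
    ISI = k * SI * SI / S if S > 0 else 0.0
    dS = gamma * I - tau * SI
    dI = tau * SI - gamma * I
    dSS = 2.0 * gamma * SI - 2.0 * tau * SSI
    dSI = gamma * (II - SI) + tau * (SSI - ISI - SI)
    dII = -2.0 * gamma * II + 2.0 * tau * (ISI + SI)
    return (dS, dI, dSS, dSI, dII)


def simulate(N: float, n: float, tau: float, gamma: float,
             I0: float, dt: float, steps: int) -> State:
    """Forward-Euler integration from randomly placed initial infections."""
    s, i = (N - I0) / N, I0 / N
    state: State = (N - I0, I0, n * N * s * s, n * N * s * i, n * N * i * i)
    for _ in range(steps):
        deriv = pairwise_sis_rhs(state, tau, gamma, n)
        state = tuple(x + dt * dx for x, dx in zip(state, deriv))
    return state
```

A quick sanity check on any such implementation is that the node count [S] + [I] and the pair count [SS] + 2[SI] + [II] stay constant over time, since the right-hand sides of those sums cancel term by term.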

16.
17.
18.
19.
The motif DGYW/WRCH (Mh) and its frequently discussed simplified derivative GYW/WRC (Mhs) are involved in immunoglobulin (Ig) hypermutation. Both these motifs appear to be markedly shorter than the corresponding conventionally predicted minima of valid sequence lengths (MVSL). The same conclusion concerning both Mh and Mhs can also be obtained in the combined case including a less strict semi-empirically defined w-value and one nucleotide length tolerance related to MVSL. Such disagreement indicates considerably low information content in Mh and Mhs when evaluating these motifs as alphabetical structures (words). This fact raises the question of which structures are actually recognized (presumably longer than Mh and Mhs). Interestingly, both Mh and Mhs dimers or pairs of closely located Mh or Mhs achieve confirmation of length validity in the case of w=0.05, thus suggesting double-motif recognition as one of the statistically consistent explanations. This possibility is also in agreement with the results of our model sequence study of mRNA derived from variable Ig gene sequences (rIgV) with respect to the most frequently occurring structures formed by motif overlaps in all model sequence sets. On the other hand, additional superior occurrence of motif pairs at a structurally important distance of a single DNA thread was found in the conserved domain (cd00099) related sequences of Elasmobranchii origin and less markedly in the corresponding human rIgV, but not in a randomly selected human subset of rIgV. The data are discussed with respect to statistical evaluation and structural properties of hypermutation motifs or the competent enzyme, i.e. activation-induced cytidine deaminase.
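
The motifs above are written in IUPAC nucleotide codes (D = A, G or T; Y = C or T; W = A or T; R = A or G; H = A, C or T), and WRCH is the reverse complement of DGYW. The following sketch shows one way to locate such degenerate motifs in a DNA string by translating them to regular expressions. It is an illustration only and not the statistical MVSL analysis described in the abstract; the function names are hypothetical.

```python
import re
from typing import List, Tuple

# Standard IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "W": "[AT]", "S": "[CG]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def iupac_to_regex(motif: str) -> str:
    """Translate an IUPAC motif such as 'DGYW' into a regex string."""
    return "".join(IUPAC[code] for code in motif.upper())


def find_motif(seq: str, motif: str) -> List[Tuple[int, str]]:
    """Return (start, matched_text) for every occurrence, including overlaps."""
    # The lookahead consumes no characters, so overlapping hits are reported.
    pattern = re.compile("(?=(" + iupac_to_regex(motif) + "))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(seq.upper())]
```

Searching one strand for both `DGYW` and `WRCH` covers occurrences of the hotspot on either strand, because a `WRCH` match on the given strand corresponds to a `DGYW` match on the complementary one.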

20.
MOTIVATION: Overlapping gene coding sequences (CDSs) are particularly common in viruses but also occur in more complex genomes. Detecting such genes with conventional gene-finding algorithms can be difficult for several reasons. If an overlapping CDS is on the same read-strand as a known CDS, then there may not be a distinct promoter or mRNA. Furthermore, the constraints imposed by double-coding can result in atypical codon biases. However, these same constraints lead to particular mutation patterns that may be detectable in sequence alignments. RESULTS: In this paper, we investigate several statistics for detecting double-coding sequences with pairwise alignments, including a new maximum-likelihood method. We also develop a model for double-coding sequence evolution. Using simulated sequences generated with the model, we characterize the distribution of each statistic as a function of sequence composition, length, divergence time and double-coding frame. Using these results, we develop several algorithms for detecting overlapping CDSs. The algorithms were tested on known overlapping CDSs and other overlapping open reading frames (ORFs) in the hepatitis B virus (HBV), Escherichia coli and Salmonella typhimurium genomes. The algorithms should prove useful for detecting novel overlapping genes, especially short coding ORFs in viruses. AVAILABILITY: Programs may be obtained from the authors. SUPPLEMENTARY INFORMATION: http://biochem.otago.ac.nz/double.html.
