Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
New human and mouse microRNA genes found by homology search   Total citations: 2 (self-citations: 0, citations by others: 2)
Weber MJ 《The FEBS journal》2005, 272(1):59-73

2.
A new method for homology search of DNA sequences is suggested. The method can be used to find extended but weak homologies containing point mutations and deletions. Its running time for comparing sequences is lower than that of dynamic programming algorithms by at least two orders of magnitude. This makes it possible to use the method for homology searching throughout the nucleotide database on personal computers.

3.
4.
A model for an antibody specific for the carcinoembryonic antigen (CEA) has been constructed using a method which combines the concept of canonical structures with conformational search. A conformational search technique is introduced which couples random generation of backbone loop conformations to a simulated annealing method for assigning side chain conformations. This technique was used both to verify conformations selected from the set of known canonical structures and to explore conformations available to the H3 loop in CEA ab initio. Canonical structures are not available for H3 due to its variability in length, sequence, and observed conformation in known antibody structures. Analysis of the results of conformational search resulted in three equally probable conformations for H3 loop in CEA. Force field energies, solvation free energies, exposure of charged residues and burial of hydrophobic residues, and packing of hydrophobic residues at the base of the loop were used as selection criteria. The existence of three equally plausible structures may reflect the high degree of flexibility expected for an exposed loop of this length. The nature of the combining site and features which could be important to interaction with antigen are discussed.

5.
PfamAlyzer is a Java applet that enables exploration of Pfam domain architectures using a user-friendly graphical interface. It can search the UniProt protein database for a domain pattern. Domain patterns similar to the query are presented graphically by PfamAlyzer either in a ranked list or pinned to the tree of life. Such domain-centric homology search can assist identification of distant homologs with shared domain architecture. AVAILABILITY: PfamAlyzer has been integrated with the Pfam database and can be accessed at http://pfam.cgb.ki.se/pfamalyzer.

6.
We are interested in detecting homologous genomic DNA sequences with the goal of locating approximate inverted, interspersed, and tandem repeats. Standard search techniques start by detecting small matching parts, called seeds, between a query sequence and database sequences. Contiguous seed models have existed for many years. Recently, spaced seeds were shown to be more sensitive than contiguous seeds without increasing the random hit rate. To determine the superiority of one seed model over another, a model of homologous sequence alignment must be chosen. Previous studies evaluating spaced and contiguous seeds have assumed that matches and mismatches occur within these alignments, but not insertions and deletions (indels). This is perhaps appropriate when searching for protein coding sequences (<5% of the human genome), but is inappropriate when looking for repeats in the majority of genomic sequence where indels are common. In this paper, we assume a model of homologous sequence alignment which includes indels and we describe a new seed model, called indel seeds, which explicitly allows indels. We present a waiting time formula for computing the sensitivity of an indel seed and show that indel seeds significantly outperform contiguous and spaced seeds when homologies include indels. We discuss the practical aspect of using indel seeds and finally we present results from a search for inverted repeats in the dog genome using both indel and spaced seeds.
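
To make the seed terminology in this abstract concrete, the following is a minimal Python sketch of seed hit detection. It is not the authors' implementation and it does not cover indel seeds. The function name `seed_hits` and the convention of writing a seed as a string of `1` (position must match) and `0` (don't-care position) are assumptions made for illustration.

```python
def seed_hits(query, target, seed):
    """Return all (i, j) such that query[i:] and target[j:] agree at every
    '1' position of the seed pattern ('0' positions are ignored)."""
    care = [k for k, c in enumerate(seed) if c == "1"]  # positions that must match
    span = len(seed)

    # Index every window of the target by the letters at the "care" positions.
    index = {}
    for j in range(len(target) - span + 1):
        key = "".join(target[j + k] for k in care)
        index.setdefault(key, []).append(j)

    # Look up every window of the query in that index.
    hits = []
    for i in range(len(query) - span + 1):
        key = "".join(query[i + k] for k in care)
        for j in index.get(key, []):
            hits.append((i, j))
    return hits


if __name__ == "__main__":
    query, target = "ACGTACGT", "ACGAACGT"  # toy sequences, one mismatch
    print(seed_hits(query, target, "1111"))   # contiguous seed, weight 4
    print(seed_hits(query, target, "11011"))  # spaced seed, weight 4
```

In this toy input the spaced seed `11011` produces a hit whose don't-care position falls on the mismatch, which is the kind of hit a contiguous seed of the same weight cannot produce at that offset.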

7.
Bacillus subtilis RecO plays a central role in recombinational repair and genetic recombination by (i) stimulating RecA filamentation onto SsbA-coated single-stranded (ss) DNA, (ii) modulating the extent of RecA-mediated DNA strand exchange and (iii) promoting annealing of complementary DNA strands. Here, we report that RecO-mediated strand annealing is facilitated by cognate SsbA, but not by a heterologous one. Analysis of non-productive intermediates reveals that RecO interacts with SsbA-coated ssDNA, resulting in transient ternary complexes. The self-interaction of ternary complexes via RecO led to the formation of large nucleoprotein complexes. In the presence of homology, SsbA, at the nucleoprotein, removes DNA secondary structures, inhibits spontaneous strand annealing and facilitates RecO loading onto SsbA–ssDNA complex. RecO relieves SsbA inhibition of strand annealing and facilitates transient and random interactions between homologous naked ssDNA molecules. Finally, both proteins lose affinity for duplex DNA. Our results provide a mechanistic framework for rationalizing protein release and dsDNA zippering as coordinated events that are crucial for RecA-independent plasmid transformation.

8.
MOTIVATION: Homology search finds similar segments between two biological sequences, such as DNA or protein sequences. The introduction of optimal spaced seeds in PatternHunter has increased both the sensitivity and the speed of homology search, and it has been adopted by many alignment programs such as BLAST. With the further improvement provided by multiple spaced seeds in PatternHunterII, Smith-Waterman sensitivity is approached at BLASTn speed. However, computing optimal multiple spaced seeds was proved to be NP-hard and current heuristic algorithms are all very slow (exponential). RESULTS: We give a simple algorithm which computes good multiple seeds in polynomial time. Due to a completely different approach, the difference with respect to the previous methods is dramatic. The multiple spaced seed of PatternHunterII, with 16 weight 11 seeds, was computed in 12 days. It takes us 17 s to find a better one. Our approach changes the way of looking at multiple spaced seeds.

9.
MOTIVATION: Homology search for RNAs can use secondary structure information to increase power by modeling base pairs, as in covariance models, but the resulting computational costs are high. Typical acceleration strategies rely on at least one filtering stage using sequence-only search. RESULTS: Here we present the multi-segment CYK (MSCYK) filter, which implements a heuristic of ungapped structural alignment for RNA homology search. Compared to gapped alignment, this approximation has lower computation time requirements (O(N^4) reduced to O(N^3)) and space requirements (O(N^3) reduced to O(N^2)). A vector-parallel implementation of this method gives up to 100-fold speed-up; vector-parallel implementations of standard gapped alignment at two levels of precision give 3- and 6-fold speed-ups. These approaches are combined to create a filtering pipeline that scores RNA secondary structure at all stages, with results that are synergistic with existing methods.

10.
MOTIVATION: Filtration is an important technique used to speed up local alignment, as exemplified in the BLAST programs. Recently, Ma et al. discovered that better filtering can be achieved by spacing out the matching positions according to a certain pattern, rather than requiring contiguous positions, to trigger a local alignment in their PatternHunter program. Such a match pattern is called a spaced seed. RESULTS: Our numerical computation shows that the ranks of spaced seeds (based on sensitivity) change with sequence similarity. Since homologous sequences may have diverse similarity, we assess the sensitivity of spaced seeds over a range of similarity levels and present a list of good spaced seeds for facilitating homology search in DNA genomic sequences. We validate that the listed spaced seeds are indeed more sensitive using three arbitrarily chosen pairs of DNA genomic sequences.
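
One way to see how seed sensitivity can be assessed over a range of similarity levels is the small dynamic program below. It is a sketch under stated assumptions, not the computation used in the paper: the alignment is modeled as independent columns that match with probability `p` (a Bernoulli model), sensitivity is taken to be the probability of at least one seed hit in a region of `length` columns, and the function name `seed_sensitivity` is hypothetical. The state space grows as 2^(span-1), so this version is only practical for short seeds.

```python
def seed_sensitivity(seed, length, p):
    """Probability that `seed` (a string of '1'/'0') hits at least once in a
    random alignment of `length` columns, each matching with probability p."""
    span = len(seed)
    care = [k for k, c in enumerate(seed) if c == "1"]

    # states maps "last span-1 columns seen, no hit so far" -> probability.
    states = {(): 1.0}
    for _ in range(length):
        nxt = {}
        for hist, prob in states.items():
            for bit, q in ((1, p), (0, 1.0 - p)):
                window = hist + (bit,)
                if len(window) == span and all(window[k] for k in care):
                    continue  # a hit occurred: this mass leaves the no-hit states
                new_hist = window[-(span - 1):] if span > 1 else ()
                nxt[new_hist] = nxt.get(new_hist, 0.0) + prob * q
        states = nxt
    return 1.0 - sum(states.values())


if __name__ == "__main__":
    # Compare two weight-3 seeds across several similarity levels.
    for p in (0.5, 0.6, 0.7, 0.8, 0.9):
        contiguous = seed_sensitivity("111", 20, p)
        spaced = seed_sensitivity("1101", 20, p)
        print(f"p={p:.1f}  111: {contiguous:.4f}  1101: {spaced:.4f}")
```

Running the loop for a list of candidate seeds and ranking them at each `p` is a simple way to check whether the ranking changes with similarity, as the abstract reports.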

11.
12.
Many species of pseudomonads produce fluorescent siderophores involved in iron uptake. We have investigated the DNA homology between the siderophore synthesis genes of an opportunist animal pathogen, Pseudomonas aeruginosa, and three plant-associated species Pseudomonas syringae, Pseudomonas putida and Pseudomonas sp. B10. There is extensive homology between the DNA from the different species, consistent with the suggestion that the different siderophore synthesis genes have evolved from the same ancestral set of genes. The existence of DNA homology allowed us to clone some of the siderophore synthesis genes from P. aeruginosa, and genetic mapping indicates that the cloned DNA lies in a locus previously identified as being involved in siderophore production.

13.
It has been observed that in homology search gapped seeds have better sensitivity than ungapped ones for the same cost (weight). In this paper, we propose a probability leakage model (a dissipative Markov system) to elucidate the mechanism that confers power to spaced seeds. Based on this model, we identify desirable features of gapped search seeds and formulate an extremely efficient procedure for seed design: it samples from the set of spaced seeds exhibiting those features, evaluates their sensitivity, and then selects the best. The sensitivity of the constructed seeds is negligibly less than that of the corresponding known optimal seeds. While the challenging mathematical question of characterizing optimal search seeds remains open, we believe that our eminently efficient and effective approach represents a satisfactory solution from a practitioner's viewpoint.

14.
We present a framework for improving local protein alignment algorithms. Specifically, we discuss how to extend local protein aligners to use a collection of vector seeds or ungapped alignment seeds to reduce noise hits. We model picking a set of seed models as an integer programming problem and give algorithms to choose such a set of seeds. While the problem is NP-hard, and Quasi-NP-hard to approximate to within a logarithmic factor, it can be solved easily in practice. A good set of seeds we have chosen allows four to five times fewer false positive hits, while preserving essentially the same sensitivity as BLASTP.
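
The seed-set selection described here can be read as a covering problem: pick seeds so that every training alignment is hit by at least one chosen seed, while keeping the false positive cost low. The sketch below deliberately swaps in the classic greedy weighted set-cover heuristic in place of the integer programming formulation in the paper. The function name `greedy_seed_cover`, the seed labels, the alignment ids, and the cost values are all hypothetical illustration data.

```python
def greedy_seed_cover(coverage, cost):
    """Greedy weighted set cover.

    coverage: dict mapping seed name -> set of alignment ids that seed hits
    cost:     dict mapping seed name -> positive false-positive cost
    Returns the chosen seed names in the order they were picked.
    """
    uncovered = set().union(*coverage.values())
    chosen = []
    while uncovered:
        # Pick the seed that covers the most still-uncovered alignments per unit cost.
        best = max(
            (s for s in coverage if s not in chosen and coverage[s] & uncovered),
            key=lambda s: len(coverage[s] & uncovered) / cost[s],
        )
        chosen.append(best)
        uncovered -= coverage[best]
    return chosen


if __name__ == "__main__":
    # Hypothetical data: four candidate seeds, five training alignments.
    coverage = {
        "A": {1, 2, 3},
        "B": {3, 4},
        "C": {4, 5},
        "D": {1, 2, 3, 4, 5},
    }
    cost = {"A": 1.0, "B": 1.0, "C": 1.0, "D": 10.0}
    print(greedy_seed_cover(coverage, cost))
```

The greedy rule gives no optimality guarantee beyond the usual logarithmic approximation factor for set cover; an exact answer would need an integer programming solver, as in the abstract.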

15.
In homology search, good spaced seeds have higher sensitivity for the same cost (weight). However, elucidating the mechanism that confers power to spaced seeds and characterizing optimal spaced seeds remain open problems. This paper investigates these two important open questions by formally analyzing the average number of non-overlapping hits and the hit probability of a spaced seed in the Bernoulli sequence model. We prove that when the length of a non-uniformly spaced seed is bounded above by an exponential function of the seed weight, the seed strictly outperforms the traditional consecutive seed of the same weight in both 1) the average number of non-overlapping hits and 2) the asymptotic hit probability. This clearly answers the first problem mentioned above in the Bernoulli sequence model. The theoretical study in this paper also gives a new solution to finding long optimal seeds.

16.
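Fu SY, Zhao DC, Zhao HL, Li JQ, Zhang WG 《遗传》2012, 34(7):919-926
This article aims to establish a seed-sequence-mediated, controllable genetic technique, the microRNA targets fingerprint (MTFP), for screening target genes associated with a specific microRNA in gene expression assays. Special adapters are added to the complement of the upstream seed sequence and to the downstream anchor sequence, and the microRNA target genes are amplified by reverse transcription followed by a special two-step PCR. The amplified target genes are then analyzed by polyacrylamide gel electrophoresis for fragment size and expression abundance, in order to screen for genes that are specifically expressed under different physiological states or experimental conditions. The sequences of specific target genes are obtained by DNA recovery and sequencing. Taking miR-203 as an example, five target gene sequences of 718 bp (JN709494), 349 bp (JN709495), 243 bp (JN709496), 156 bp (JN709497), and 97 bp (JN709498) were obtained from goat skin samples in different physiological states. MTFP is economical and easy to perform, and can be used to explore target genes regulated by microRNAs or to evaluate the expression profiles of target genes.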
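
The complement of a microRNA seed sequence, which this method builds its upstream primer on, can be derived mechanically. The sketch below is an illustration only and does not reproduce the MTFP primer or adapter design. It assumes the common convention that the seed is nucleotides 2 to 8 counted from the 5' end; the function name `seed_match_site` and the example sequence are hypothetical, and the example is deliberately not the sequence of miR-203.

```python
# Watson-Crick complement of each RNA base, written as a DNA base.
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def seed_match_site(mirna, start=2, end=8):
    """Return the DNA sequence (5'->3') complementary to the seed region of
    `mirna`, where the seed is positions start..end (1-based, inclusive)."""
    seed = mirna.upper()[start - 1:end]
    # Reverse the seed, then complement each base, to read the site 5'->3'.
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(seed))


if __name__ == "__main__":
    example_mirna = "UACGGAUCCAGUUAGCAUGCAA"  # made-up 22-nt sequence
    print(seed_match_site(example_mirna))      # site matching seed ACGGAUC
```

Changing `start` and `end` covers other seed definitions, such as positions 2 to 7.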

17.
18.
To initiate homologous recombination, sequence similarity between two DNA molecules must be searched for and homology recognized. How the search for and recognition of homology occurs remains unproven. We have examined the influences of DNA topology and the polarity of RecA–single-stranded (ss)DNA filaments on the formation of synaptic complexes promoted by RecA. Using two complementary methods and various ssDNA and duplex DNA molecules as substrates, we demonstrate that topological constraints on a small circular RecA–ssDNA filament prevent it from interwinding with its duplex DNA target at the homologous region. We were unable to detect homologous pairing between a circular RecA–ssDNA filament and its relaxed or supercoiled circular duplex DNA targets. However, the formation of synaptic complexes between an invading linear RecA–ssDNA filament and covalently closed circular duplex DNAs is promoted by supercoiling of the duplex DNA. The results imply that a triplex structure formed by non-Watson–Crick hydrogen bonding is unlikely to be an intermediate in homology searching promoted by RecA. Rather, a model in which RecA-mediated homology searching requires unwinding of the duplex DNA coupled with local strand exchange is the likely mechanism. Furthermore, we show that polarity of the invading RecA–ssDNA does not affect its ability to pair and interwind with its circular target duplex DNA.

19.
MOTIVATION: The expression of genes during the cell division process has now been studied in many different species. An important goal of these studies is to identify the set of cycling genes. To date, this was done independently for each of the species studied. Due to noise and other data analysis problems, accurately deriving a set of cycling genes from expression data is a hard problem. This is especially true for some of the multicellular organisms, including humans. RESULTS: Here we present the first algorithm that combines microarray expression data from multiple species for identifying cycling genes. Our algorithm represents genes from multiple species as nodes in a graph. Edges between genes represent sequence similarity. Starting with the measured expression values for each species we use Belief Propagation to determine a posterior score for genes. This posterior is used to determine a new set of cycling genes for each species. We applied our algorithm to improve the identification of the set of cell cycle genes in budding yeast and humans. As we show, by incorporating sequence similarity information we were able to obtain a more accurate set of genes compared to methods that rely on expression data alone. Our method was especially successful for the human dataset indicating that it can use a high quality dataset from one species to overcome noise problems in another. AVAILABILITY: C implementation is available from the supporting website: http://www.cs.cmu.edu/~lyongu/pub/cellcycle/.

20.