Similar articles
 20 similar articles found (search time: 0 ms)
1.

Background  

DNA sequence comparison is based on optimal local alignment of two sequences using a similarity score. However, some new DNA sequencing technologies do not directly measure the base sequence, but rather an encoded form, such as the two-base encoding considered here. In order to compare such data to a reference sequence, the data must be decoded into sequence. The decoding is deterministic, but the possibility of measurement errors requires searching among all possible error modes and resulting alignments to achieve an optimal balance of fewer errors versus greater sequence similarity.
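
The deterministic decoding mentioned above can be illustrated with a minimal sketch. It assumes one commonly described two-base scheme, in which bases are coded A=0, C=1, G=2, T=3 and each measured "color" is the XOR of two adjacent base codes; the abstract does not spell out its encoding, so the scheme and the function names `encode` and `decode` are illustrative assumptions, not the authors' method.

```python
# Hypothetical sketch of two-base ("color space") encoding and decoding.
# Assumption: A=0, C=1, G=2, T=3, and each color is the XOR of the codes
# of two adjacent bases. This is not taken from the abstract itself.
BASES = "ACGT"


def encode(seq):
    """Return the first base and the list of colors for a base sequence."""
    codes = [BASES.index(b) for b in seq]
    colors = [a ^ b for a, b in zip(codes, codes[1:])]
    return seq[0], colors


def decode(first_base, colors):
    """Rebuild the base sequence from a known first base and its colors."""
    code = BASES.index(first_base)
    out = [first_base]
    for color in colors:
        code ^= color  # next base code = previous base code XOR color
        out.append(BASES[code])
    return "".join(out)


if __name__ == "__main__":
    first, colors = encode("ATGGC")
    print(colors)                       # colors for the error-free read
    print(decode(first, colors))        # round trip back to bases
    print(decode(first, [3, 0, 0, 3]))  # one wrong color changes every later base
```

Under this assumed scheme a single miscalled color alters every base decoded after it, which is one way to see why measurement errors cannot simply be handled after decoding and have to be considered together with the alignment.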

2.
3.
Green fluorescent protein from Aequorea victoria and its many homologs are now widely used in basic and applied research. These genetically encoded fluorescent markers can detect localization of cell proteins and organelles in living cells and also cells and tissues in living organisms. Unique instruments and methods for studies of molecular biology of a cell and high throughput drug screenings are based on fluorescent proteins. This review deals with the most intensively evolving directions in this field, the development of genetically encoded sensors. Changes in their spectral properties are used for monitoring of cell enzyme activities or changes in concentrations of particular molecules.

4.
MOTIVATION: Current software tools are moderately effective in predicting genetic structure (exons, introns, intergenic regions, and complete genes) from raw DNA sequence data. Improvements in accuracy and speed are needed to deal with the increasing volume of data from large scale sequencing projects. RESULTS: We present a two-stage computer program to predict genetic structure in eukaryotic DNA. The first stage makes use of a novel statistical technique, called reference point logistic (RPL) regression, to calculate scores for potential functional sites. These site scores are combined with interval content, length, and state scores, via a Generalized Hidden Markov Model, to determine a combined score for each possible parse of a given DNA sequence into exons, introns, and intergenic regions. An optimal parse is found using a dynamic programming algorithm. In the second stage, protein sequence alignment methods are applied to improve the accuracy of the initial parse. Computation in the first stage of the program is very fast (1 s on a 360 MHz CPU for a 16 kb sequence) and its predictive accuracy typically matches or exceeds the best results reported for other methods (Sensitivity = 0.93 and Specificity = 0.93 for the Burset/Guigó test set). Computation in the second stage is slower, but the final predictions are more accurate (Sn = 0.97, Sp = 0.97). The program (called GRPL) can handle partial, single, and multi-gene sequences. The program is also capable of predicting the genetic structure of vertebrate, invertebrate, and plant DNA with nearly equal accuracy. Statistical techniques have also been introduced to model the effects of varying C+G content in a continuous manner and to control overfitting of parameters for smaller training sets. AVAILABILITY: An academic implementation of GRPL, compiled for SUN workstations, is available by anonymous ftp from snipe.pharmacy.ualberta.ca/pub. The training and test sets used in this work, together with supplementary material, can be found at the same location. A commercial implementation is available as a component of GeneTool (BioTools Inc., http://biotools.com).
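
As a reading aid for the Sn and Sp figures quoted above, here is a minimal sketch of how such values could be computed at the nucleotide level. It assumes the definitions commonly used in gene-prediction evaluation, where sensitivity is TP/(TP+FN) and "specificity" is TP/(TP+FP); the abstract does not state at which level its figures were measured, and the function name `sn_sp` and the 0/1 label encoding are hypothetical.

```python
def sn_sp(true_coding, pred_coding):
    """Nucleotide-level sensitivity and specificity.

    Both arguments are equal-length sequences of 0/1 flags, one per base,
    where 1 marks a coding base. Assumed definitions (not from the abstract):
    Sn = TP / (TP + FN), Sp = TP / (TP + FP).
    """
    if len(true_coding) != len(pred_coding):
        raise ValueError("label sequences must have the same length")
    pairs = list(zip(true_coding, pred_coding))
    tp = sum(1 for t, p in pairs if t and p)      # coding, predicted coding
    fn = sum(1 for t, p in pairs if t and not p)  # coding, missed
    fp = sum(1 for t, p in pairs if p and not t)  # non-coding, predicted coding
    sn = tp / (tp + fn) if tp + fn else 0.0
    sp = tp / (tp + fp) if tp + fp else 0.0
    return sn, sp


if __name__ == "__main__":
    truth = [1, 1, 1, 1, 0, 0, 0, 0]
    guess = [1, 1, 1, 0, 1, 0, 0, 0]
    print(sn_sp(truth, guess))
```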

5.
6.

Background  

Chimera proteins are widely used for the analysis of protein-protein interaction regions. One of the major applications is epitope analysis of monoclonal antibodies. In this analysis, a continuous portion of an antigen is sequentially replaced with a different sequence. This method works well for an antibody recognizing a linear epitope, but not for one recognizing a discontinuous epitope. Although designing chimera proteins based on tertiary structure information is required in such situations, no appropriate tool has been available so far.

7.
The most popular way of comparing the performance of multiple sequence alignment programs is to use empirical testing on sets of test sequences. Several such test sets now exist, each with potential strengths and weaknesses. We apply several different alignment packages to 6 benchmark datasets, and compare their relative performances. HOMSTRAD, a collection of alignments of homologous proteins, is regularly used as a benchmark for sequence alignment though it is not designed as such, and lacks annotation of reliable regions within the alignment. We introduce this annotation into HOMSTRAD using protein structural superposition. Results on each database show that method performance is dependent on the input sequences. Alignment benchmarks are regularly used in combination to measure performance across a spectrum of alignment problems. Through combining benchmarks, it is possible to detect whether a program has been over-optimised for a single dataset, or alignment problem type.

8.

Background  

An algorithm is presented to compute a multiple structure alignment for a set of proteins and to generate a consensus (pseudo) protein which captures common substructures present in the given proteins. The algorithm represents each protein as a sequence of triples of coordinates of the alpha-carbon atoms along the backbone. It then computes iteratively a sequence of transformation matrices (i.e., translations and rotations) to align the proteins in space and generate the consensus. The algorithm is a heuristic in that it computes an approximation to the optimal alignment that minimizes the sum of the pairwise distances between the consensus and the transformed proteins.

9.
MicroRNA identification based on sequence and structure alignment   Total citations: 20, self-citations: 0, citations by others: 20
MOTIVATION: MicroRNAs (miRNA) are approximately 22 nt long non-coding RNAs that are derived from larger hairpin RNA precursors and play important regulatory roles in both animals and plants. The short length of the miRNA sequences and relatively low conservation of pre-miRNA sequences restrict the conventional sequence-alignment-based methods to finding only relatively close homologs. On the other hand, it has been reported that miRNA genes are more conserved in secondary structure than in primary sequence. Therefore, secondary structural features should be more fully exploited in the homologue search for new miRNA genes. RESULTS: In this paper, we present a novel genome-wide computational approach to detect miRNAs in animals based on both sequence and structure alignment. Experiments show that this approach has higher sensitivity than, and specificity comparable to, other reported homologue searching methods. We applied this method to Anopheles gambiae and detected 59 new miRNA genes. AVAILABILITY: This program is available at http://bioinfo.au.tsinghua.edu.cn/miralign. SUPPLEMENTARY INFORMATION: Supplementary information is available at http://bioinfo.au.tsinghua.edu.cn/miralign/supplementary.htm.

10.
Evidence is provided that the nucleotide triplet consensus non-T(A/T)G (abbreviated to VWG) influences nucleosome positioning and nucleosome alignment into regular arrays. This triplet consensus has been recently found to exhibit a fairly strong 10 bp periodicity in human DNA, implicating it in anisotropic DNA bendability. It is demonstrated that the experimentally determined preferences for nucleosome positioning in native SV40 chromatin can, to a large extent, be predicted simply by counting the occurrences of the period-10 VWG consensus. Nucleosomes tend to form in regions of the SV40 genome that contain high counts of period-10 VWG and/or avoid regions with low counts. In contrast, periodic occurrences of the dinucleotides AA/TT, implicated in the rotational positioning of DNA in nucleosomes, did not correlate with the preferred nucleosome locations in SV40 chromatin. Periodic occurrences of AA did correlate with preferred nucleosome locations in a region of SV40 DNA where VWG occurrences are low. Regular oscillations in period-10 VWG counts with a dinucleosome period were found in vertebrate DNA regions that aligned nucleosomes into regular arrays in vitro in the presence of linker histone. Escherichia coli and plasmid DNA, which fail to align nucleosomes in vitro, lacked these regular VWG oscillations.
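
The counting idea described above can be sketched as follows. This is only one plausible reading: it treats a "period-10" occurrence as a pair of VWG triplets starting exactly 10 bp apart on the given strand within a window. The abstract does not give the exact counting rule, so the window logic, the single-strand restriction, and the names `vwg_positions` and `period10_count` are all assumptions.

```python
import re

# V = any base except T (A, C or G); W = A or T; followed by G.
# A lookahead is used so that overlapping triplets (e.g. AAG and GAG in
# "AAGAG") are all reported.
VWG = re.compile(r"(?=[ACG][AT]G)")


def vwg_positions(seq):
    """Return 0-based start positions of every VWG triplet in seq."""
    return [m.start() for m in VWG.finditer(seq.upper())]


def period10_count(seq, start, window, period=10):
    """Count VWG pairs exactly `period` bp apart, both starting in the window.

    Hypothetical counting rule, not the published one.
    """
    inside = {p for p in vwg_positions(seq) if start <= p < start + window}
    return sum(1 for p in inside if p + period in inside)


if __name__ == "__main__":
    # Toy sequence with CAG placed every 10 bp.
    toy = "CAG" + "T" * 7 + "CAG" + "T" * 7 + "CAG"
    print(vwg_positions(toy))
    print(period10_count(toy, 0, len(toy)))
```

Sliding the window along a genome and plotting the count per window would give a profile that could then be compared with mapped nucleosome positions.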

11.
The primary structure of the Citrus ichangensis satellite DNA repeating unit has been determined. The repeat is 181 bp long and contains four pentanucleotides of adenine residues. Oligomeric forms of the stDNA repeating unit were detected by partial hydrolysis of C. ichangensis stDNA with BspI restriction endonuclease. Comparison of oligomer mobility in agarose and polyacrylamide gels showed some retardation in polyacrylamide gel, indicating a slight bend in the repeating unit. The BEN computer program [9] was employed to calculate the spatial positions of monomer and oligomer axes of the satellite DNA repeating unit of Citrus ichangensis, mouse and African green monkey, and to plot their two-dimensional projections. In higher oligomeric forms, the bends in the monomer proved to result in a hypothetical solenoid-like structure, termed coiled double helix (CDH).

12.
13.
Gordon M. Crippen 《Biopolymers》1977,16(10):2189-2201
The x-ray crystal structures of 19 selected proteins are examined empirically for correlations between the amino acid sequence and long-range, tertiary conformation. There is clear evidence for preferential associations of certain types of amino acids, particularly among the hydrophobic aliphatic, aromatic, and cysteine residues. However, the likelihoods of forming these residue-pair contacts are all less than 12%, so packing and geometric requirements must often take precedence over energetic considerations. The prediction of long-range contacts is not substantially improved by taking into account the sequentially previous residues. The analysis of atom–atom contacts shows a similar lack of predictive ability, but the results show that a good approximation to the interresidue energy function must include different types of interactions at two or three different sites on some amino acids. Backbone–backbone long-range interactions are relatively rare and nonspecific, whereas some “polar” side chains form hydrogen bonds from the polar groups while occasionally forming hydrophobic contacts with the remainder of the chain.

14.
Mitochondrial DNA (mtDNA) sequences are widely used for inferring the phylogenetic relationships among species. Clearly, the assumed model of nucleotide or amino acid substitution used should be as realistic as possible. Dependence among neighboring nucleotides in a codon complicates modeling of nucleotide substitutions in protein-encoding genes. It seems preferable to model amino acid substitution rather than nucleotide substitution. Therefore, we present a transition probability matrix of the general reversible Markov model of amino acid substitution for mtDNA-encoded proteins. The matrix is estimated by the maximum likelihood (ML) method from the complete sequence data of mtDNA from 20 vertebrate species. This matrix represents the substitution pattern of the mtDNA-encoded proteins and shows some differences from the matrix estimated from the nuclear-encoded proteins. The use of this matrix would be recommended in inferring trees from mtDNA-encoded protein sequences by the ML method. Received: 3 May 1995 / Accepted: 31 October 1995
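
For orientation, the standard textbook form of a general reversible Markov model of amino acid substitution can be written as below. The symbols ($Q$, $q_{ij}$, $s_{ij}$, $\pi_i$, $P(t)$) are notation introduced here for illustration; the abstract does not give the paper's own parameterization.

```latex
% General reversible model over the 20 amino acids (notation assumed here).
% Q = (q_{ij}) is the instantaneous rate matrix, \pi the equilibrium frequencies.
\begin{align}
  q_{ij} &= s_{ij}\,\pi_j \quad (i \neq j), \qquad s_{ij} = s_{ji}, \\
  q_{ii} &= -\sum_{j \neq i} q_{ij}, \\
  \pi_i\, q_{ij} &= \pi_j\, q_{ji} \quad \text{(reversibility)}, \\
  P(t) &= e^{Qt}, \qquad P_{ij}(t) = \Pr(\text{state } j \text{ at time } t \mid \text{state } i \text{ at time } 0).
\end{align}
```

With 20 states there are 190 symmetric exchangeabilities $s_{ij}$ and 19 free frequencies $\pi_i$, which gives a sense of how many parameters a maximum likelihood estimate of such a matrix involves.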

15.
16.
On the computation of the tertiary structure of globular proteins   Total citations: 1, self-citations: 0, citations by others: 1
A method is presented to compute the approximate locations of α carbon atoms of proteins using experimentally obtainable information. This information consists of distances between nearest neighbor α carbon atoms, locations of SS bonds, primary sequence of amino acids as reflected by hydrophobic and hydrophilic residues and the assumption of globularity. The method permits the reconstruction of structures similar to the real ones, and is readily extendable to compute structures more accurately by incorporating additional information.

17.
Determining structural similarities between proteins is an important problem since it can help identify functional and evolutionary relationships. In this paper, an algorithm is proposed to align two protein structures. Given the protein backbones, the algorithm finds a rigid motion of one backbone onto the other such that large substructures are matched. The algorithm uses a representation of the backbones that is independent of their relative orientations in space and applies dynamic programming to this representation to compute an initial alignment, which is then refined iteratively. Experiments indicate that the algorithm is competitive with two well-known algorithms, namely DALI and LOCK.

18.
19.
20.
Analyses of the conformational dynamics of the numerous cellular ribonucleoprotein particles (RNP) significantly contribute to the understanding of their modes of action. Here, we tested whether ribonuclease fusion proteins incorporated into RNPs can be used as molecular probes to characterize the local RNA environment of these proteins. Fusion proteins of micrococcal nuclease (MNase) with ribosomal proteins were expressed in S. cerevisiae to produce in vivo recombinant ribosomes which have a ribonuclease tethered to specific sites. Activation of the MNase activity by addition of calcium led to specific rRNA cleavage events in proximity to the ribosomal binding sites of the fusion proteins. The dimensions of the RNP environment which could be probed by this approach varied with the size of the linker sequence between MNase and the fused protein. Advantages and disadvantages of the use of MNase fusion proteins for local tertiary structure probing of RNPs as well as alternative applications for this type of approach in RNP research are discussed.
