Found 20 similar articles (search time: 15 ms)
1.
Based on a five-letter model of the 20 amino acids, we propose a new 2-D graphical representation of protein sequence. Then we transform the 2-D graphical representation into a numerical characterization that will facilitate quantitative comparisons of protein sequences. As an application, we construct the phylogenetic tree of 56 coronavirus spike proteins. The resulting tree agrees well with the established taxonomic groups.
2.
3.
We introduce a novel 2D graphical representation of DNA sequences based on pairs of neighboring nucleotides (PNNs). Then we get the PNNs' distributions and obtain a y-M. The construction of the PNN-curve has some important advantages: (1) it avoids loss of information, and the PNN-curve standing for a DNA sequence does not overlap or intersect itself; (2) the novel 2D representation is more sensitive. The utility of this method is illustrated by the examination of similarities/dissimilarities among the coding sequences of the first exon of the beta-globin gene of eleven different species in Table 2.
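The abstract above does not say how the PNN distributions are computed. As a minimal sketch, assuming a "distribution" simply means the relative frequency of each of the 16 adjacent nucleotide pairs, it could be tabulated as follows. The function name `pnn_distribution` is hypothetical and is not taken from the paper.

```python
from collections import Counter
from itertools import product


def pnn_distribution(seq):
    """Relative frequency of each of the 16 neighbouring-nucleotide pairs.

    Assumption: overlapping pairs (positions i, i+1) are counted, and pairs
    containing anything other than A, C, G, T are skipped.
    """
    seq = seq.upper()
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    pairs = [p for p in pairs if set(p) <= set("ACGT")]
    counts = Counter(pairs)
    total = len(pairs)
    return {
        a + b: (counts[a + b] / total if total else 0.0)
        for a, b in product("ACGT", repeat=2)
    }


if __name__ == "__main__":
    # Toy sequence, not data from the paper.
    for pair, freq in pnn_distribution("ATGGTG").items():
        if freq > 0:
            print(pair, round(freq, 3))
```

Two sequences can then be compared by any distance between their 16-entry distributions. Which distance the authors use is not stated in the abstract.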
4.
We consider a novel 2-D graphical representation of DNA sequences according to chemical structures of bases, reflecting distribution of bases with different chemical structure, preserving information on sequential adjacency of bases, and allowing numerical characterization. The representation avoids loss of information accompanying alternative 2-D representations in which the curve standing for DNA overlaps and intersects itself. Based on this representation we present a numerical characterization approach by the leading eigenvalues of the matrices associated with the DNA sequences. The utility of the approach is illustrated on the coding sequences of the first exon of human beta-globin gene.
5.
Sai Zou Lei Wang Junfeng Wang 《EURASIP Journal on Bioinformatics and Systems Biology》2014,2014(1):1-7
In this paper, we first present a new concept of ‘weight’ for the 64 triplets and define a different weight for each kind of triplet. Then, we give a novel 2D graphical representation for DNA sequences, which transforms a DNA sequence into a plot set to facilitate quantitative comparisons of DNA sequences. Thereafter, in combination with a newly designed measure of similarity, we introduce a novel approach to the analysis of similarities/dissimilarities among DNA sequences. Finally, applications to the similarity/dissimilarity analysis of the complete coding sequences of β-globin genes of 11 species illustrate the utility of the proposed method.
6.
7.
We have shown earlier that analysis of DNA sequences using a two-dimensional graphical representation provides considerable information on new global sequence patterns and homologies, repeated structures, relative base abundances, probable evolutionary paths and evolutionary divergence. We have also reported that at a more micro level the graphical representation reveals distinct differences in the features of intron and exon segments of eukaryotic sequences. In this paper, the distinguishing features of the intron and exon segments are exploited to show, through several examples of different gene structures, that an averaging procedure over the slopes of the representative maps provides an easy technique to differentiate between probable intron and exon regions. We thus expect that this method will enable a rapid search and preliminary indication of possible locations of protein coding regions in long eukaryotic sequences.
8.
A new approach using a 3-D Cartesian coordinate system to represent protein sequences has been derived. Using this 3-D graphical representation, we compare sequences belonging to nine different proteins.
9.
We have developed a program for the graphic representation and manipulation of DNA sequences. The program (named CARTE, from the French for map) is intended as a tool in the planning and analysis of recombinant DNA experiments. DNA sequences are represented as standard restriction maps, using any desired combination of restriction enzymes. Features of interest, such as promoters or coding sequences, can be highlighted. The sequence can be manipulated to mimic cloning, using deletions, insertions or replacements at specified sites. This process is facilitated by the simultaneous display of a graphic map of the entire sequence, a detailed picture of the work in progress, and a menu of functions.
Received on November 17, 1986; accepted on March 12, 1987
10.
G W Rowe 《Journal of theoretical biology》1985,112(2):433-444
Multi-dimensional scaling is applied to our codon space data on the protein coding sequences of DNA from a wide variety of organisms in an attempt to find the smallest number of parameters which will accurately represent these sequences. I find that a three-dimensional representation is satisfactory. One of the three resulting co-ordinates separates eukaryotes and their associated viruses from prokaryotes and their associated phages, while an orthogonal co-ordinate separates those organisms capable of synthesizing proteins (eukaryotes and prokaryotes) from those not so capable (viruses and phages). Mitochondria show no relation in our plots to any of these groups.
11.
12.
We introduce a 3D graphical representation of DNA sequences based on the pairs of dual nucleotides (DNs). Based on this representation, we consider some mathematical invariants and construct two 16-component vectors associated with these invariants. The vectors are used to characterize and compare the complete coding sequence part of beta globin gene of nine different species. The examination of similarities/dissimilarities illustrates the utility of the approach.
13.
In this study, a simple 4^k-dimensional feature representation vector is proposed for reconstructing phylogenetic trees, where k is the word length. The elements of the vector characterize the relative difference between a biological sequence and a sequence generated by an independent random process. In addition, the variance of the vector obtained by averaging every column of the feature representation matrix is used to determine an appropriate word length. In our experiments, reliable results are always obtained when the word length is <7, which appears to keep the computational complexity low. Phylogenetic trees of 24 transferrins and 48 Hepatitis E viruses reconstructed at word length 6 are in good agreement with previous studies, which shows that our method is efficient and powerful.
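The abstract above gives no formula for the feature vector. The sketch below shows one plausible reading under explicit assumptions: the sequence is DNA over A, C, G, T (so there are 4^k words of length k), the "independent random process" draws each base independently with the sequence's own base frequencies, and the "relative difference" is (observed - expected) / expected. The function name `kmer_feature_vector` is hypothetical, and the authors' actual definition may differ.

```python
from collections import Counter
from itertools import product
from math import prod


def kmer_feature_vector(seq, k):
    """Relative deviation of each k-word frequency from an independence model.

    Assumptions (not from the abstract): seq contains only A, C, G, T; the
    expected frequency of a word is the product of its base frequencies.
    Words with zero expected frequency get 0.0.
    """
    seq = seq.upper()
    n_words = len(seq) - k + 1
    if n_words <= 0:
        raise ValueError("sequence is shorter than the word length k")

    base_freq = {b: seq.count(b) / len(seq) for b in "ACGT"}
    observed = Counter(seq[i:i + k] for i in range(n_words))

    vector = []
    # Words are enumerated in lexicographic order: AA..A, AA..C, ..., TT..T.
    for word in map("".join, product("ACGT", repeat=k)):
        expected = prod(base_freq[b] for b in word)
        obs = observed[word] / n_words
        vector.append((obs - expected) / expected if expected > 0 else 0.0)
    return vector


if __name__ == "__main__":
    # Toy sequence, not data from the paper.
    v = kmer_feature_vector("ACGTACGT", 2)
    print(len(v))
```

The vector length grows as 4^k, which is one reason a short word length keeps the cost manageable. Pairwise distances between such vectors could then feed a standard tree-building method, although the abstract does not say which distance or method the authors use.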
14.
We consider the problem of comparing several nucleic acid sequences to identify words occurring imperfectly (patterns with no gap) with unusual frequency. Methods for computing, representing, and inspecting interactively the structure of such repeating motifs in nucleic acids and more generally any text are described. Multiple sequences are treated as one large concatenate. In a preprocessing step, a lexical index is created to provide rapid string matching for the enumeration of the words matching a pattern. For given word features (word length, minimal frequency), a sequence profile is displayed. The profile can be inspected interactively with on-line algorithms. Applications to the identification of regulatory elements in DNA regions involved in the control of gene expression are presented. Our program (DNA-Lexemics) runs on the Macintosh.
15.
16.
17.
Selective amplification in PCR is principally determined by the sequence of the primers and the temperature of the annealing step. We have developed a new PCR technique for distinguishing related sequences in which additional selectivity is dependent on sequences within the amplicon. A 5′ extension is included in one (or both) primer(s) that corresponds to sequences within one of the related amplicons. After copying and incorporation into the PCR product, this sequence is then able to loop back, anneal to the internal sequences and prime to form a hairpin structure; this structure is then refractory to further amplification. Thus, amplification of sequences containing a perfect match to the 5′ extension is suppressed while amplification of sequences containing mismatches or lacking the sequence is unaffected. We have applied Headloop PCR to DNA that had been bisulphite-treated for the selective amplification of methylated sequences of the human GSTP1 gene in the presence of up to a 10^5-fold excess of unmethylated sequences. Headloop PCR has a potential for clinical application in the detection of differently methylated DNAs following bisulphite treatment as well as for selective amplification of sequence variants or mutants in the presence of an excess of closely related DNA sequences.
18.
GRR: graphical representation of relationship errors
SUMMARY: A graphical tool for verifying assumed relationships between individuals in genetic studies is described. GRR can detect many common errors using genotypes from many markers. AVAILABILITY: GRR is available at http://bioinformatics.well.ox.ac.uk/GRR.
19.
Thermodynamic properties of DNA sequences: characteristic values for the human genome
MOTIVATION: Central to many molecular biology techniques as ubiquitous as PCR and Southern blotting is the design of oligonucleotide (oligo) probes and/or primers possessing specific thermodynamic properties. Here, we use validated theoretical methods to generate distributions of predicted thermodynamic properties for DNA oligos of various lengths. These distributions facilitate immediate appreciation of typical thermodynamic values for oligos of various lengths. RESULTS: Distributions of melting temperature (Tm), free energy (ΔG°(T)), and fraction hybridized or fraction bound (Fb), are presented for oligos of length 10-50 bases sampled from the human genome. The effects of changing temperature, oligo and salt concentrations, constraining G+C content, and introducing mismatches are exemplified. Our results provide the first survey of typical and limiting thermodynamic values evaluated on a genomic scale. Described numbers comprise useful 'rules of thumb' that are applicable to most technologies dependent upon DNA oligo design.
20.
In this paper, a novel 3D graphical representation of DNA sequences based on codons is proposed. Because no information is lost through overlapping or loops, this representation is useful for comparing different DNA sequences. The 3D curve is especially convenient for comparing DNA mutations. We then give a numerical characterization of DNA sequences based on the new 3D curve. This characterization facilitates quantitative, codon-based analysis of similarities/dissimilarities among DNA sequences.