Found 20 similar articles (search time: 15 ms)
1.
2.
Guosen Xie 《Journal of theoretical biology》2011,269(1):123-108
In this article, we introduce three 3D graphical representations of DNA primary sequences, which we call RY-curve, MK-curve and SW-curve, based on three classifications of the DNA bases. The advantages of our representations are that (i) these 3D curves are strictly non-degenerate and there is no loss of information when transferring a DNA sequence to its mathematical representation and (ii) the coordinates of every node on these 3D curves have a clear biological implication. Two applications of these 3D curves are presented: (a) a simple formula is derived to calculate the content of the four bases (A, G, C and T) from the coordinates of nodes on the curves; and (b) a 12-component characteristic vector is constructed to compare similarity among DNA sequences from different species based on the geometrical centers of the 3D curves. As examples, we examine similarity among the coding sequences of the first exon of beta-globin gene from eleven species and validate similarity of cDNA sequences of beta-globin gene from eight species.
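The base-content calculation in application (a) can be illustrated with plain class counts. This is only a sketch, not the formula from the article, which works from curve coordinates that the abstract does not give. It assumes the standard classifications R = {A, G} (purines), M = {A, C} (amino) and W = {A, T} (weak hydrogen bonds); the function names are hypothetical.

```python
def class_counts(seq):
    """Return (n, r, m, w): sequence length and the R, M and W class counts."""
    seq = seq.upper()
    n = sum(seq.count(b) for b in "ACGT")
    r = seq.count("A") + seq.count("G")  # purines (R)
    m = seq.count("A") + seq.count("C")  # amino bases (M)
    w = seq.count("A") + seq.count("T")  # weak-bond bases (W)
    return n, r, m, w


def base_counts(n, r, m, w):
    """Recover the four base counts from the three class counts.

    Since r + m + w = 3A + G + C + T = 2A + n, the count of A follows
    directly, and the other three bases follow by subtraction.
    """
    a = (r + m + w - n) // 2
    return {"A": a, "G": r - a, "C": m - a, "T": w - a}
```

For example, `base_counts(*class_counts("AAGCTT"))` recovers two A, one G, one C and two T, which shows that the three binary classifications together lose no information about base composition.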
3.
We introduce a 3D graphical representation of DNA sequences based on the pairs of dual nucleotides (DNs). Based on this representation, we consider some mathematical invariants and construct two 16-component vectors associated with these invariants. The vectors are used to characterize and compare the complete coding sequence part of beta globin gene of nine different species. The examination of similarities/dissimilarities illustrates the utility of the approach.
4.
In this paper, a novel 3D graphical representation of DNA sequences based on codons is proposed. Since no information is lost through overlapping or loops, this representation is useful for comparing different DNA sequences, and the 3D curve is especially convenient for comparing DNA mutations. We then give a numerical characterization of DNA sequences based on the new 3D curve. This characterization facilitates quantitative analysis of similarities/dissimilarities among DNA sequences based on codons.
5.
We introduce a novel 2D graphical representation of DNA sequences based on the pairs of the neighboring nucleotides (PNNs). Then we get the PNNs' distributions and obtain a y-M. The construction of the PNN-curve has some important advantages: (1) it avoids loss of information, and the PNN-curve standing for a DNA sequence does not overlap or intersect with itself; (2) the novel 2D representation is more sensitive. The utility of this method is illustrated by the examination of similarities/dissimilarities among the coding sequences of the first exon of the beta-globin gene of eleven different species in Table 2.
6.
We consider a novel 2-D graphical representation of DNA sequences according to chemical structures of bases, reflecting distribution of bases with different chemical structure, preserving information on sequential adjacency of bases, and allowing numerical characterization. The representation avoids loss of information accompanying alternative 2-D representations in which the curve standing for DNA overlaps and intersects itself. Based on this representation we present a numerical characterization approach by the leading eigenvalues of the matrices associated with the DNA sequences. The utility of the approach is illustrated on the coding sequences of the first exon of human beta-globin gene.
7.
Sai Zou Lei Wang Junfeng Wang 《EURASIP Journal on Bioinformatics and Systems Biology》2014,2014(1):1-7
In this paper, we first present a new concept of ‘weight’ for the 64 triplets and define a different weight for each kind of triplet. Then, we give a novel 2D graphical representation for DNA sequences, which can transform a DNA sequence into a plot set to facilitate quantitative comparisons of DNA sequences. Thereafter, in association with a newly designed measure of similarity, we introduce a novel approach to the similarity/dissimilarity analysis of DNA sequences. Finally, applications to the similarity/dissimilarity analysis of the complete coding sequences of β-globin genes of 11 species illustrate the utility of our newly proposed method.
8.
Cristina Stan Constantin P. Cristescu Eugen I. Scarlat 《Journal of theoretical biology》2010,267(4):513-518
Using chaos game representation (CGR) we introduce a novel and straightforward method for identifying similarities/dissimilarities between DNA sequences of the same type, from different organisms. A matrix is associated with each CGR pattern and the similarities result from the comparison between the matrices of the sequences of interest. Three different methods of analysis of the resulting difference matrix are considered: a 3-dimensional representation giving both local and global information, a numerical characterization by defining an n-letter word similarity measure and a statistical evaluation. The method is illustrated by application to the study of albumin nucleotide sequences from eight mammal species, taking human albumin as the reference.
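The pipeline described in this abstract (CGR pattern, associated matrix, difference matrix) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the corner assignment for the four bases, the grid resolution parameter `k`, and all function names are assumptions, since the abstract specifies none of them.

```python
# Corner assignment is an assumed, commonly used CGR convention.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq):
    """Return the CGR points of seq: each point is the midpoint between
    the previous point and the corner of the current base."""
    x, y = 0.5, 0.5  # start at the centre of the unit square
    points = []
    for base in seq.upper():
        if base not in CORNERS:
            continue  # skip ambiguous symbols such as N
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points


def cgr_matrix(seq, k=2):
    """Count CGR points in a 2^k x 2^k grid over the unit square."""
    n = 2 ** k
    grid = [[0] * n for _ in range(n)]
    for x, y in cgr_points(seq):
        col = min(int(x * n), n - 1)
        row = min(int(y * n), n - 1)
        grid[row][col] += 1
    return grid


def difference_matrix(a, b):
    """Element-wise difference of two equally sized CGR matrices."""
    return [[va - vb for va, vb in zip(ra, rb)] for ra, rb in zip(a, b)]
```

A call such as `difference_matrix(cgr_matrix(s1), cgr_matrix(s2))` then gives one matrix per sequence pair. For sequences of different lengths, the counts would probably need to be normalized to frequencies before subtraction.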
9.
10.
A fractal method to distinguish coding and non-coding sequences in a complete genome is proposed, based on different statistical behaviors between these two kinds of sequences. We first propose a number sequence representation of DNA sequences. Multifractal analysis is then performed on the measure representation of the obtained number sequence. The three exponents C(-1), C1 and C2 are selected from the result of multifractal analysis. Each DNA may be represented by a point in the three-dimensional space generated by these three-component vectors. It is shown that points corresponding to coding and non-coding sequences in the complete genome of many prokaryotes are roughly distributed in different regions. Fisher's discriminant algorithm can be used to separate these two regions in the spanned space. If the point (C(-1), C1, C2) for a DNA sequence is situated in the region corresponding to coding sequences, the sequence is discriminated as a coding sequence; otherwise, the sequence is classified as a non-coding one. For all 51 prokaryotes we considered, the average discriminant accuracies pc, pnc, qc and qnc reach 72.28%, 84.65%, 72.53% and 84.18%, respectively.
11.
With the use of polymerases having 3′ to 5′ exonuclease activity and 3′ phosphorothioate-modified allele-specific primers, we recently devised a SNP-operated on/off switch controlling DNA polymerization. One advantage of this novel on/off switch is its adaptability to arrayed primer extension. To further expand its application in genetic analysis, the new on/off switch was evaluated in discrimination of the match/mismatch status of single nucleotides upstream from the primer 3′ terminus. A set of seven amplicons was developed with the templates differing from each other by a single nucleotide. Using this set of amplicons, the new on/off switch was shown to be able to efficiently discriminate single nucleotide polymorphisms from the primer 3′ terminus to the −6 position from the primer 3′ terminus. These data, illustrating the broad single nucleotide discrimination ability of this novel on/off switch, explain why the SNP-operated on/off switch is powerful in SNP analysis, and also indicate useful applications to genetic analysis in addition to SNP assay. First, these data broaden the application of the novel on/off switch to the analysis of mutations other than SNPs. Second, they suggest a nucleotide-walking algorithm suitable for de novo array-based sequencing analysis.
12.
The objective of these studies was to test the hypothesis that proteins that contain potential polyisoprenyl recognition sequences (PIRSs) in their transmembrane-spanning domain can bind to the polyisoprenyl (PI) glycosyl carrier lipids undecaprenyl phosphate (C55-P) and dolichyl phosphate (C95-P). A number of prokaryotic and eukaryotic glycosyltransferases that utilize PI coenzymes contain a conserved PIRS postulated to be the active PI binding domain. To study this problem, we first determined the 3D structure of a PIRS peptide, NeuE, by homonuclear 2D 1H-nuclear magnetic resonance (NMR) spectroscopy. Experimentally generated distance constraints derived from nuclear Overhauser enhancement and torsion angle constraints derived from coupling constants were used for restrained molecular dynamics and energy minimization calculations. Molecular models of the NeuE peptide were built based on calculations of energy minimization using the DGII program NMRchitect. 3D models of dolichol (C95) and C95-P were built based on our 2D 1H-NMR nuclear Overhauser enhancement spectroscopy (NOESY) results and refined by energy minimization with respect to all atoms using the AMBER (assisted modeling with energy refinements) force field. Our energy minimization studies were carried out on a conformational model of dolichol that was originally derived from small-angle X-ray scattering and molecular mechanics methods. These results revealed that the PIs are conformationally nearly identical tripartite molecules, with their three domains arranged in a coiled, helical structure. Analyses of the intermolecular cross-peaks in the 2D NOESY spectra of PIRS peptides in the presence of PIs confirmed a highly specific interaction and identified key contact amino acids in the NeuE peptide that constituted a binding motif for interacting with the PIs. These studies also showed that subtle conformational changes occurred within both the PIs and the NeuE peptide after binding. 
3D structures of the resulting molecular complexes revealed that each PI could bind more than one PIRS peptide. These studies thus represent the first evidence for a direct physical interaction between specific contact amino acids in the PIRS peptides and the PIs and support the hypothesis of a bifunctional role for the PIs. The central idea is that these superlipids may serve as a structural scaffold to organize and stabilize in functional domains PIRS-containing proteins within multiglycosyltransferase complexes that participate in biosynthetic and translocation processes.
13.
14.
David J. Pountney Iosif Gulkarov Eleazar Vega-Saenz de Miera Douglas Holmes Michael Saganich Bernardo Rudy Michael Artman William A. Coetzee 《FEBS letters》1999,450(3):191-196
We have identified and cloned a new member of the mammalian tandem pore domain K+ channel subunit family, TWIK-originated similarity sequence, from a human testis cDNA library. The 939 bp open reading frame encodes a 313 amino acid polypeptide with a calculated Mr of 33.7 kDa. Despite the same predicted topology, there is a relatively low sequence homology between TWIK-originated similarity sequence and other members of the mammalian tandem pore domain K+ channel subunit family group. TWIK-originated similarity sequence shares a low (< 30%) identity with the other mammalian tandem pore domain K+ channel subunit family group members and the highest identity (34%) with TWIK-1 at the amino acid level. Similar low levels of sequence homology exist between all members of the mammalian tandem pore domain K+ channel subunit family. Potential glycosylation and consensus PKC sites are present. Northern analysis revealed species and tissue-specific expression patterns. Expression of TWIK-originated similarity sequence is restricted to human pancreas, placenta and heart, while in the mouse, TWIK-originated similarity sequence is expressed in the liver. No functional currents were observed in Xenopus laevis oocytes or HEK293T cells, suggesting that TWIK-originated similarity sequence may be targeted to locations other than the plasma membrane or that TWIK-originated similarity sequence may represent a novel regulatory mammalian tandem pore domain K+ channel subunit family subunit.
15.
The high-mobility-group (HMG) chromosomal protein wheat HMGa was purified to homogeneity and tested for its binding characteristics to double-stranded DNA. Wheat HMGa was able to bind to P268, an A/T-rich fragment derived from the pea plastocyanin gene promoter, producing a small mobility shift in gel retardation assays where the bound complex was sensitive to addition of proteinase K but resistant to heat treatment of the protein, consistent with the identity of wheat HMGa as a putative HMG-I/Y protein. Gel retardation assays and southwestern hybridization analysis revealed that wheat HMGa could selectively interact with the DNA polynucleotides poly(dA).poly(dT), poly(dAdT).poly(dAdT), and poly(dG).poly(dC), but not with poly(dGdC).poly(dGdC). Surface plasmon resonance analysis determined the kinetic and affinity constants of sensor chip-immobilized wheat HMGa for double-stranded DNA 10-mers, revealing a good affinity of the protein for various dinucleotide combinations, except that of alternating GC sequence. Thus, contrary to prior reports of a selectivity of wheat HMGa for A/T-rich DNA, the protein appears to be able to interact with sequences containing guanine and cytosine residues as well, except where G/C residues alternate directly in the primary sequence.
16.
We construct 4^L-component vectors for a DNA primary sequence based on its L-tuples. For two DNA sequences, using the corresponding vectors, we construct a set of L × L matrices called related matrices. Mathematical characterizations derived from the constructed matrices are used to characterize the degree of similarity between the two DNA sequences. The search for sequences similar to a query sequence in a database of 39 library sequences and the construction of a phylogenetic tree of H5N1 avian influenza virus illustrate the utility of the matrices for DNA sequences.
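The L-tuple vector in this abstract can be illustrated with a simple counting sketch. It assumes that each of the 4^L possible L-tuples over {A, C, G, T} gets one component holding its number of overlapping occurrences. The abstract does not define the related matrices, so the Euclidean distance below is a stand-in comparison and not the authors' measure; all names are hypothetical.

```python
from itertools import product
from math import sqrt


def ltuple_vector(seq, L=2):
    """Return the 4**L-component vector of overlapping L-tuple counts."""
    # Components are ordered lexicographically: AA, AC, AG, AT, CA, ...
    index = {"".join(p): i for i, p in enumerate(product("ACGT", repeat=L))}
    vec = [0] * (4 ** L)
    seq = seq.upper()
    for i in range(len(seq) - L + 1):
        word = seq[i:i + L]
        if word in index:  # ignore tuples containing ambiguous symbols
            vec[index[word]] += 1
    return vec


def euclidean_distance(u, v):
    """Stand-in dissimilarity between two L-tuple vectors (assumption)."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
```

Smaller distances between the vectors of two sequences indicate more similar L-tuple composition. Raw counts depend on sequence length, so dividing each vector by its sum may be preferable when lengths differ.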
17.
Characterizing the spatial variation of allele frequencies in a population has a wide range of applications in population genetics. This article introduces a new nonparametric method, which provides a two-dimensional representation of a structural parameter called the genetical bandwidth, which describes genetic structure around arbitrary spatial locations in a study area. This parameter corresponds to the shortest distance to areas of significant allele variation, and its computation is based on Womble's systemic function. A simulation study and application to data sets taken from the literature give evidence that the method is particularly demonstrative when the fine-scale structure is stronger than the large-scale structure, and that it is generally able to locate genetic boundaries or clines precisely.
18.
19.
Jonathan H. Davis 《Journal of biomolecular NMR》1995,5(4):433-437
2D 15N-1H correlation spectra are ideal for measuring backbone amide populations to determine amide exchange protection factors in studies of protein folding or other structural features. Most protein NMR spectroscopists use HSQC, which has been shown to be generally superior to HMQC in both resolution and sensitivity. The refocused HSQC experiment is intrinsically less sensitive than the regular HSQC, due to T2 relaxation during the refocusing delays. However, we show here that, when high 15N resolution is needed, an optimized refocused HSQC sequence that utilizes a semi-constant time evolution period and pulsed field gradients has better signal-to-noise ratio and resolution, and integrates more accurately, than a similar HSQC. The differences are demonstrated on a 20 kDa protein. The technique can also be applied to 3D NOESY experiments to eliminate strong NH2 geminal peaks and their truncation artefacts at a modest cost in sensitivity.
20.
Alaguraj Veluchamy Sujitha Mary Vishal Acharya Preethi Mehta Taru Deva Sankaran Krishnaswamy 《Bioinformation》2009,4(2):80-83
The HNH Database is a collection and sequence-based classification of HNH domain proteins. The database contains about 1913 HNH domain-containing proteins, classified into 10 subsets based on sequence pattern. Each of these subsets has unique signature sequences. We have shown a correlation between the subset combination and their domain association and function. Functional divergence of this domain may be due to the combination of these conserved patterns and the large variations in the non-conserved regions. HNHDb is freely available at http://bicmku.in:8081/hnh.