Similar literature
 20 similar articles found (search time: 500 ms)
1.
Words are irregularly distributed in genetic texts. The analysis of this irregularity leads to the notion of stationary and non-stationary words. The polyW and polyS tracts are shown to be the most non-stationary words in genetic texts (here W = [A,T] and S = [G,C]; a polyW tract is a sequence of A,T nucleotides and a polyS tract is a sequence of G,C nucleotides). The distribution of stationary words suggests a method for partitioning DNA into zones. The zones obtained in the case of the phage are interpreted in the light of the Dowe hypothesis of the modular structure of bacteriophage genomes.

2.
Genome inhomogeneity is determined mainly by WW and SS dinucleotides   Cited by: 1 (self-citations: 0; other citations: 1)
According to the hypothesis of the modular structure of DNA, genomes consist of modules of various nature which may differ in statistical characteristics. Statistical analysis helps in revealing the differences in statistical characteristics and predicting the modular structure. In this connection, the question arises of how much each word of length l (l-tuple) contributes to the inhomogeneity of genetic text. The notion of stationary (i.e. relatively evenly distributed over a genome) versus non-stationary l-tuples has been introduced previously. In this paper, the dinucleotide distributions for all long sequences from GenBank were analyzed and it was shown that non-stationary dinucleotides are closely associated with polyW and polyS tracts (W denotes the 'weak' nucleotides A or T, while S stands for the 'strong' nucleotides G or C). Thus, genome inhomogeneity is shown to be determined mainly by the AA, TT, GG, CC, AT, TA, GC and CG dinucleotides. It has been demonstrated that neither 'codon usage' nor the 'isochore model' can account for this phenomenon.
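
The W/S grouping used in the abstract above can be illustrated with a short sketch. It is not the authors' stationarity analysis; it only shows how the eight dinucleotides named there (AA, TT, AT, TA, GG, CC, GC, CG) fall into the WW and SS classes, with every other pair counted as mixed. The function names `dinucleotide_class` and `class_counts` are hypothetical, and the input is assumed to contain only A, C, G and T.

```python
from collections import Counter

WEAK = set("AT")    # W: 'weak' nucleotides
STRONG = set("GC")  # S: 'strong' nucleotides


def dinucleotide_class(pair):
    """Return 'WW', 'SS', or 'mixed' for a two-letter DNA word."""
    first, second = pair
    if first in WEAK and second in WEAK:
        return "WW"
    if first in STRONG and second in STRONG:
        return "SS"
    # Anything else (including pairs with ambiguity codes such as N)
    # is counted as mixed in this sketch.
    return "mixed"


def class_counts(seq):
    """Count the overlapping dinucleotides of seq by W/S class."""
    seq = seq.upper()
    return Counter(dinucleotide_class(seq[i:i + 2]) for i in range(len(seq) - 1))


counts = class_counts("AATTGGCCAG")
print(counts["WW"], counts["SS"], counts["mixed"])
```

Counting per class in successive windows of a long sequence would be one simple way to see how unevenly the WW and SS words are spread along a genome.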

3.
Abstract

Most fibrous polynucleotides of general sequence exhibit secondary structures that are described adequately by regular helices with a repeated motif of only one nucleotide. Such helices exploit the fact that A:T, T:A, G:C, and C:G pairs are essentially isomorphous and have dyadically-related glycosylic bonds. Polynucleotides with regularly repeated base-sequences sometimes assume secondary structures with larger repeated motifs which reflect these base-sequences. The dinucleotide units of the Z-like forms of poly d(As4T):poly d(As4T), poly d(AC):poly d(GT) and poly d(GC):poly d(GC) are dramatic instances of this phenomenon. The wrinkled B and D forms of poly d(GC):poly d(GC) and poly d(AT):poly d(AT) are just as significant but more subtle examples. It is possible also to trap more exotic secondary structures in which the molecular asymmetric unit is even larger. There is, for example, a tetragonal form of poly d(AT):poly d(AT) which has unit cell dimensions a = b = 1.71 nm, c = 7.40 nm, γ = 90°. The c dimension corresponds to the pitch of a molecular helix which accommodates 24 successive nucleotide pairs arranged as a 4₃ helix of hexanucleotide duplexes. The great variety of nucleotide conformations which occur in these large asymmetric units has prompted us to describe them as pleiomeric, a term used in botany to describe whorls having more than the usual number of structures. Pleiomeric DNAs need not contain nucleotide conformations that are very different from one another. On the other hand, DNAs carrying nucleotides of very different conformation must be pleiomeric. This is because 4 nucleotides of different conformation are needed to join patches of secondary structure which are as different as A or B or Z. Differences in nucleotide structures may occur also between chains rather than within chains. In poly d(A):poly d(T), the purine nucleotides all contain C3'-endo furanose rings and the pyrimidine nucleotides C2'-endo rings. Analogous heteronomous structures may exist in DNA-RNA hybrids, although these duplexes are also found to have symmetrical A-type conformations.

4.
Abstract

The frequencies of “words”, oligonucleotides within nucleotide sequences, reflect the genetic information contained in the sequence “texts”. Nucleotide sequences are characteristically represented by their contrast word vocabularies. Comparison of the sequences by correlating their contrast vocabularies is shown to reflect well the relatedness (or unrelatedness) of the sequences. A single value, the linguistic similarity between the sequences, is suggested as a measure of sequence relatedness. Sequences as short as 1000 bases can be characterized and quantitatively related to other sequences by this technique. The linguistic sequence similarity value is used for analysis of taxonomically and functionally diverse nucleotide sequences. The similarity value is shown to be very sensitive to the relatedness of the source species, thus providing a convenient tool for taxonomic classification of species by their sequence vocabularies. Functionally diverse sequences appear distinct by their linguistic similarity values. This can be a basis for a quick screening technique for functional characterization of the sequences and for mapping functionally distinct regions in long sequences.
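
The idea of comparing sequences through their word vocabularies can be sketched as follows. The abstract does not define how contrast values are computed, so this sketch swaps in something simpler: a plain Pearson correlation between the k-mer frequency vectors of two sequences. The function names, the default word length k = 3, and the use of raw frequencies instead of contrast values are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product
from math import sqrt


def word_frequencies(seq, k=3):
    """Relative frequencies of all 4**k DNA words, in a fixed alphabetical order."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    words = ["".join(p) for p in product("ACGT", repeat=k)]
    return [counts[w] / total if total else 0.0 for w in words]


def pearson(x, y):
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def vocabulary_similarity(seq_a, seq_b, k=3):
    """Correlate the k-mer vocabularies of two sequences (stand-in for the paper's measure)."""
    return pearson(word_frequencies(seq_a, k), word_frequencies(seq_b, k))


print(vocabulary_similarity("ACGTACGTTTGACA", "ACGTACGTTTGACA"))
```

A sequence compared with itself scores 1 up to rounding, and sequences with disjoint vocabularies score slightly below zero; real use would need sequences long enough for the word counts to be meaningful.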

5.
Abstract

Previous studies of the dinucleotides flanking both the 5′ and 3′ ends of homooligomer tracts have shown that some flanks are consistently preferred over others (1,2). In the first preferred group, the homooligomer tracts are flanked by the same nucleotide and/or the complementary nucleotides, e.g., ATAn, TTAn, CCGn, where n=2–5. Runs flanked by nucleotides with which they cannot base pair are distinctly disfavored. (In this group A/Tn are flanked by C and/or G; Gn/Cn are flanked by A/T, e.g., CGAn, TnGG, G., AT). The frequencies of runs flanked by A or T, and G or C (“mixed” group) are as expected. Here we seek the origin of this effect and its relevance to protein-DNA interactions. Surprisingly, within the first group, runs flanked by their complements with a pyrimidine-purine junction (e.g., TTAn, CnGG) are greatly preferred. The frequencies of their purine-pyrimidine junction mirror images are just as expected. This effect, as well as additional ones enumerated below, is seen universally in eukaryotes and in prokaryotes, although it is stronger in the former. Detailed analysis of regulatory regions shows these strong trends, particularly in GC sequences. The potential relationship to DNA conformation and DNA-protein interaction is discussed.

6.
7.
The frequently discussed analogy between genetic and human texts is explored by comparing the alternation of polar and non-polar amino-acid residues in proteins with the alternation of consonants and vowels in human texts. In human languages, the usage of possible combinations of consonants and vowels is influenced by the pronounceability of the combinations. Similarly, the oligopeptide composition of proteins is influenced by the requirements of protein folding and stability. One special type of structure often present in proteins is the amphipathic α-helix, in which polar and non-polar amino acids alternate with a period of 3.5 residues, not unlike the alternation of consonants and vowels. In this study, we evaluated the contribution made by amphipathic alternations to the protein sequence texts (20–24%). Their proportion is lower than the respective values for alternating words in human texts (57–89%). The proteomes (full sets of proteins for selected organisms) were transformed into ranked sequences of n-grams (words of length n), including periodical amphipathic structures. Similarly, human texts were transformed into sequences of alternating consonants and vowels. Analysis of the vocabularies shows that in both types of texts (human languages and proteins) the alternating words are dominant or highly preferred, thus strengthening the analogy between these two types of texts. The contribution of amphipathic words in the upper parts of the ranked lists for 10 analyzed proteomes varies between 58 and 74%. In human texts the respective values range between 90 and 100%.

8.
Base substitution is one of the raw fuels that produce genetic variation and drive evolution. Recent studies have shown that the genome components affect mutation patterns to some extent. In order to infer the correlation between the transition/transversion ratio (Ts/Tv) and the number of immediately adjacent A&T nucleotides, we investigated 3611007 Oryza sativa SNPs (including 45462 coding SNPs and 242811 intronic SNPs) and 32019 Arabidopsis SNPs. The results show that Ts/Tv is negatively correlated with the number of immediately adjacent A&T nucleotides in O. sativa and Arabidopsis. We further calculated AT2 (the number of SNPs whose immediately adjacent nucleotides are either A or T) and AT0 (the number of SNPs whose immediately adjacent nucleotides are either C or G) for all 6 types of SNPs. C/G SNPs of O. sativa and Arabidopsis have the highest AT2/AT0 ratio, which suggests that C/G SNPs may be the type most strongly influenced by adjacent A&T nucleotides. For SNPs in O. sativa, the neighboring effect of A&T nucleotides is limited to 2 nucleotides on both sides; for SNPs in Arabidopsis, the effect extends no more than 4 nucleotides on both sides.
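
The Ts/Tv bookkeeping described above can be sketched in a few lines. A transition swaps two purines (A, G) or two pyrimidines (C, T); every other substitution is a transversion. The tuple layout `(left, ref, alt, right)` and the function names are assumptions for illustration, not the authors' pipeline; the 0/1/2 grouping counts how many of the two immediately adjacent bases are A or T, so the groups 0 and 2 correspond to the AT0 and AT2 classes of the abstract.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
WEAK = {"A", "T"}


def is_transition(ref, alt):
    """True for A<->G or C<->T substitutions, False for transversions."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt:
        raise ValueError("ref and alt must differ")
    pair = {ref, alt}
    return pair <= PURINES or pair <= PYRIMIDINES


def ts_tv_by_flanking_at(snps):
    """Tally (transitions, transversions) by number of adjacent A/T bases.

    snps: iterable of (left, ref, alt, right) single-letter tuples.
    """
    table = {0: [0, 0], 1: [0, 0], 2: [0, 0]}
    for left, ref, alt, right in snps:
        n_at = sum(base.upper() in WEAK for base in (left, right))
        table[n_at][0 if is_transition(ref, alt) else 1] += 1
    return {n: tuple(counts) for n, counts in table.items()}


example = [("A", "C", "T", "T"), ("G", "A", "C", "C"), ("A", "G", "A", "C")]
print(ts_tv_by_flanking_at(example))
```

Dividing the first count by the second in each group gives the Ts/Tv ratio for that flanking context, provided the transversion count is not zero.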

9.
Abstract

Theoretical exploration of the possible interaction of netropsin with tRNAPhe indicates that binding should occur preferentially with the major groove of the TψC stem of the macromolecule, specifically with the bases G51, U52, G53 and phosphates 52, 53, 61 and 62. This agrees with the recent crystallographic result of Rubin and Sundaralingam. It is demonstrated that the difference with respect to netropsin binding with B-DNA, where it occurs specifically in the minor groove of AT sequences, is due to the differences in the distribution of the electrostatic molecular potential generated by these different types of DNA: this potential is sequence dependent in B-DNA (located in the minor groove of AT sequences and the major groove of GC sequences), while it is sequence independent and always located in the major groove in A-RNA. The result demonstrates the major role of electrostatics in determining the location of the binding site.

10.
Abstract

We have synthesized two RNA fragments: a 42-mer corresponding to the full sequence of the loop I region of ColE1 antisense RNA (RNA I), plus three additional Gs at the 5′-end, and a 31-mer which has 11 5′-end nucleotides (G(-2)-U9) deleted. The secondary structure of the 42-mer, deduced from one- and two-dimensional NMR spectra, consists of a stem of 11 base-pairs which contains a U-U base-pair and a bulged C base, a 7 nucleotide loop, and a single-stranded 5′ end of 12 nucleotides. The UV-melting study of the 42-mer further revealed a multi-step melting behavior with transition temperatures of 32°C and 71°C clearly discernible. In conjunction with the NMR melting study, the major transition at 71°C is assigned to the overall melting of the stem region and the 32°C transition is assigned to the opening of the loop region. The deduced secondary structure agrees with that proposed for the intact RNA I and provides a structural basis for understanding the specificity of RNase E.

11.
12.
Abstract

Alternating (dA-dT)n sequences in supercoiled DNA may undergo a transition to a left-handed conformation in the presence of Ni2+ ions and high NaCl concentration (Nejedlý, K., Klysik, J. and Paleček, E., FEBS Lett. 243, 313–317 (1989)). In this work we have found that the ionic conditions necessary for the B-to-Z transition are strongly dependent on the sequences flanking the (dA-dT)n tract. In particular, the presence of 5′-homopyrimidine (C3) and 3′-homopurine (G4) blocks adjacent to the tract was found to facilitate the transition to the left-handed form. Within a constant sequence context it was found that the ionic strength required to promote the transition was inversely proportional to the length of the (dA-dT)n sequence.

13.
Base substitution, the most common type of mutation, in which one or more bases are replaced by others, is the main cause of individual variation, community diversity and the evolution of species. Studying the role and mechanism of base substitution could help peopl…

14.
Background: The aim of this study was to evaluate the OpenArray platform for genetic testing of blood donors and to assess the genotype frequencies of single-nucleotide polymorphisms (SNPs) associated with venous thrombosis (G1691A and G20210A), hyperhomocysteinemia (C677T, A1298C), and hereditary hemochromatosis (C282Y, H63D and S65C) in blood donors from Sao Paulo, Brazil.
Methods: We examined 400 blood donor samples collected from October to November 2011. The SNPs were detected using OpenArray technology. The blood samples were also examined using a real-time PCR–FRET system to compare the results and determine the accuracy of the OpenArray method.
Results: We observed 100% agreement in all assays tested, except HFE C282Y, which showed 99.75% agreement. The HFE C282Y assay was further confirmed through direct sequencing, and the results showed that OpenArray analysis was accurate. The calculated frequencies of each SNP were FV G1691A 98.8% (G/G), 1.2% (G/A); FII G20210A 99.5% (G/G), 0.5% (G/A); MTHFR C677T 45.5% (C/C), 44.8% (C/T), 9.8% (T/T); MTHFR A1298C 60.3% (A/A), 33.6% (A/C), 6.1% (C/C); HFE C282Y 96% (G/G), 4% (G/A); HFE H63D 78.1% (C/C), 20.3% (C/G), 1.6% (G/G); and HFE S65C 98.1% (A/A), 1.9% (A/T).
Conclusion: Taken together, these results describe the frequencies of SNPs associated with diseases and are important to enhance our current knowledge of the genetic profiles of Brazilian blood donors, although a larger study is needed for a more accurate determination of the frequency of the alleles. Furthermore, the OpenArray platform showed a high concordance rate with standard FRET RT-PCR.

15.
In the present study, we determined the complete mitochondrial DNA (mtDNA) sequences of two species of Cistopus, namely C. chinensis and C. taiwanicus, and conducted a comparative mt genome analysis across the class Cephalopoda. The mtDNA lengths of C. chinensis and C. taiwanicus are 15706 and 15793 nucleotides, with AT contents of 76.21% and 76.5%, respectively. The sequence identity of mtDNA between C. chinensis and C. taiwanicus was 88%, suggesting a close relationship. Compared with C. taiwanicus and other octopods, C. chinensis encoded two additional tRNA genes, showing a novel gene arrangement. In addition, an unusual 23 poly (A) signal structure is found in the ATP8 coding region of C. chinensis. The entire genome and each protein-coding gene of the two Cistopus species displayed notable levels of AT and GC skew. Based on sliding window analysis among Octopodiformes, ND1 and ND5 were considered to be more reliable molecular beacons. Phylogenetic analyses based on the 13 protein-coding genes revealed that C. chinensis and C. taiwanicus form a monophyletic group with high statistical support, consistent with previous studies based on morphological characteristics. Our results also indicated that the phylogenetic position of the genus Cistopus is closer to Octopus than to Amphioctopus and Callistoctopus. The complete mtDNA sequences of C. chinensis and C. taiwanicus represent the first whole mt genomes in the genus Cistopus. These novel mtDNA data will be important in refining the phylogenetic relationships within Octopodiformes and enriching the resource of markers for systematic, population genetic and evolutionary biological studies of Cephalopoda.
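
AT content and the AT/GC skews mentioned above are simple composition statistics. The abstract does not state its formulas, so the sketch below assumes the commonly used definitions, AT skew = (A − T)/(A + T) and GC skew = (G − C)/(G + C), computed on one strand. The function names are hypothetical, and bases other than A, C, G and T are ignored.

```python
def base_composition(seq):
    """Count A, C, G and T in seq; any other symbol is ignored."""
    seq = seq.upper()
    return {base: seq.count(base) for base in "ACGT"}


def at_content(seq):
    """Percentage of A+T among the counted bases (0.0 for an empty sequence)."""
    comp = base_composition(seq)
    total = sum(comp.values())
    return 100.0 * (comp["A"] + comp["T"]) / total if total else 0.0


def skew(x, y):
    """(x - y) / (x + y), or 0.0 when both counts are zero."""
    return (x - y) / (x + y) if (x + y) else 0.0


def at_gc_skew(seq):
    """Return (AT skew, GC skew) under the assumed definitions above."""
    comp = base_composition(seq)
    return skew(comp["A"], comp["T"]), skew(comp["G"], comp["C"])


print(at_content("AATG"), at_gc_skew("AATG"))
```

Applying the same functions to successive windows of a genome gives a basic sliding-window profile of composition along the sequence.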

16.
Recombination between satellite RNAs of turnip crinkle virus.   Cited by: 13 (self-citations: 0; other citations: 13)

17.
The present study was undertaken to explore the genetic basis of caprine prolificacy and to screen indigenous goats for prolificacy-associated markers of sheep in the BMPR1B, GDF9 and BMP15 genes. To detect the associated mutations and identify novel allelic variants in the candidate genes, representative samples were collected from the breeding tract of indigenous goat breeds varying in prolificacy and geographic distribution. DNA was extracted and PCR amplification was done using primers designed, or available in the literature, for the coding DNA sequence of the candidate genes. Direct sequencing was done to identify the genetic variations. Mutations in the candidate genes associated with fecundity in sheep were not detected in Indian goats. Three non-synonymous SNPs (C818T, A959C and G1189A) were identified in exon 2 of the GDF9 gene, of which mutation A959C has been associated with prolificacy in exotic goats. Two novel SNPs (G735A and C808G) were observed in exon 2 of the BMP15 gene.

18.
Abstract

The analysis of association constants between dextran-coupled intercalators and nucleotides revealed base- and sequence-selective affinity for mono- and dinucleotides in aqueous solution. Acridine-bound CH-Sepharose 4B, designed as an affinity stationary phase for nucleotides, also showed base- and sequence-selective affinity.

19.
20.
Background: SNPs are the most abundant polymorphism type, and have been explored in many crop genomic studies, including rice and maize. SNP discovery in allotetraploid cotton genomes has lagged behind that of other crops due to their complexity and polyploidy. In this study, genome-wide SNPs are detected systematically using next-generation sequencing and efficient SNP genotyping methods, and used to construct a linkage map and characterize the structural variations in polyploid cotton genomes.
Results: We construct an ultra-dense inter-specific genetic map comprising 4,999,048 SNP loci distributed unevenly in 26 allotetraploid cotton linkage groups and covering 4,042 cM. The map is used to order tetraploid cotton genome scaffolds for accurate assembly of G. hirsutum acc. TM-1. Recombination rates and hotspots are identified across the cotton genome by comparing the assembled draft sequence and the genetic map. Using this map, genome rearrangements and centromeric regions are identified in tetraploid cotton by combining information from the publicly-available G. raimondii genome with fluorescent in situ hybridization analysis.
Conclusions: We report the genotype-by-sequencing method used to identify millions of SNPs between G. hirsutum and G. barbadense. We construct and use an ultra-dense SNP map to correct sequence mis-assemblies, merge scaffolds into pseudomolecules corresponding to chromosomes, detect genome rearrangements, and identify centromeric regions in allotetraploid cottons. We find that the centromeric retro-element sequence of tetraploid cotton derived from the D subgenome progenitor might have invaded the A subgenome centromeres after allotetrapolyploid formation. This study serves as a valuable genomic resource for genetic research and breeding of cotton.

Electronic supplementary material

The online version of this article (doi:10.1186/s13059-015-0678-1) contains supplementary material, which is available to authorized users.
