Similar articles
Found 20 similar articles (search time: 31 ms)
1.
Wiuf C, Hein J. Genetics, 1999, 151(3): 1217-1228
In this article we discuss the ancestry of sequences sampled from the coalescent with recombination with constant population size 2N. We have studied a number of variables based on simulations of sample histories, and some analytical results are derived. Consider the leftmost nucleotide in the sequences. We show that the number of nucleotides sharing a most recent common ancestor (MRCA) with the leftmost nucleotide is approximately log(1 + 4NLr)/(4Nr) when two sequences are compared, where L denotes sequence length in nucleotides, and r the recombination rate between any two neighboring nucleotides per generation. For larger samples, the number of nucleotides sharing an MRCA with the leftmost nucleotide decreases and becomes almost independent of 4NLr. Further, we show that a segment of the sequences sharing an MRCA consists on average of 3/(8Nr) nucleotides when two sequences are compared, and that this decreases toward 1/(4Nr) nucleotides when the whole population is sampled. A measure of the correlation between the genealogies of two nucleotides on two sequences is introduced. We show analytically that even when the nucleotides are separated by a large genetic distance, but share an MRCA, the genealogies will show only little correlation. This is surprising, because the time until the two nucleotides shared an MRCA is reciprocal to the genetic distance. Simulations show that the mean time until all positions in the sample have found an MRCA increases logarithmically with increasing sequence length and is considerably lower than a theoretically predicted upper bound. On the basis of simulations, it turns out that important properties of the coalescent with recombination of the whole population are reflected in the properties of a small sample.
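
The three closed-form approximations quoted in this abstract can be evaluated directly. The following is a minimal sketch, not the authors' code: the function names and the example parameter values are hypothetical, and it assumes that "log" means the natural logarithm and that the denominators are read as (4Nr) and (8Nr).

```python
import math


def shared_mrca_length(N, L, r):
    """Approximate number of nucleotides sharing an MRCA with the leftmost
    nucleotide for two sequences: log(1 + 4NLr) / (4Nr).
    Assumes the natural logarithm."""
    rho = 4.0 * N * L * r          # scaled recombination rate of the whole sequence
    return math.log1p(rho) / (4.0 * N * r)


def mean_segment_length_pair(N, r):
    """Mean length of a segment sharing an MRCA, two sequences: 3 / (8Nr)."""
    return 3.0 / (8.0 * N * r)


def mean_segment_length_population(N, r):
    """Limiting mean segment length when the whole population is sampled: 1 / (4Nr)."""
    return 1.0 / (4.0 * N * r)


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the article.
    N, L, r = 10_000, 100_000, 1e-8
    print(shared_mrca_length(N, L, r))
    print(mean_segment_length_pair(N, r))
    print(mean_segment_length_population(N, r))
```

When 4NLr is very small the first expression approaches L, which is consistent with the whole sequence sharing one MRCA when recombination is negligible.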

2.
The two ribozymes found in hepatitis delta virus RNA form related but non-identical secondary structures and display similar cleavage properties in vitro. Three of the non-duplex elements hypothesized to contribute nucleotides to the catalytic core vary slightly in length between the two ribozymes and the differences are conserved in clinical isolates. Possible functional relationships of the core sequence elements were tested by systematically exchanging sequences between the two ribozymes. It was found that switching two of the elements (L3 and J4/2) from one ribozyme to the other reduced cleavage activity in both. On the other hand, exchanging the third region (J1/4) resulted in enhanced activity for one ribozyme and a smaller increase in activity for the other. Combining exchanges did not reveal any compensatory interactions involving these particular elements nor did a pattern emerge that would suggest an optimal combination of core sequences for a generalized HDV ribozyme. Non-compensatory behavior reinforces the idea that the non-duplex sequences may form sequence-specific contacts with duplex portions of the ribozyme, but, in addition, these data suggest that there may be selective pressures on the ribozyme sequences in the virus that are not reflected in the in vitro self-cleavage assays.

3.
Nature selected certain regions of the genome for encoding proteins. Most of the sequences were used to encode only RNA. What happened to the remaining sections of the genome? It is possible that some sequences were retired and retained as non-functional entities called pseudogenes. Though several evolutionary prospects with functional endpoints exist, we looked at the possibility of hypothetical proteins correlating with the emergence of pseudogenes and the potential of such genes to make novel synthetic molecules. In this commentary, we consider two key aspects: (1) does any correlation exist between hypothetical proteins and pseudogenes and (2) can we make novel and functional proteins from pseudogenes?

4.
We propose a model for generating "artificial" nucleotide sequences and, by the method of mapping those sequences onto a "DNA-walk," we analyze the presence of correlation between nucleotides. Artificial sequences are constructed considering, basically, interactions between first neighbors and between more distant units. We show that long-range correlations may be favored by the occurrence of intrastrand interactions, which give a nonlinear characteristic to the sequence.
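
The abstract does not state which mapping rule its "DNA-walk" uses. As a rough illustration, the sketch below assumes the common purine/pyrimidine convention (a step of +1 for C or T, -1 for A or G) and returns the cumulative displacement along the sequence; the function name `dna_walk` is hypothetical.

```python
from itertools import accumulate

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def dna_walk(sequence):
    """Map a nucleotide sequence onto a one-dimensional walk.

    Assumed rule (not taken from the abstract): pyrimidine -> +1, purine -> -1.
    Returns the walker's position after each nucleotide.
    """
    steps = []
    for base in sequence.upper():
        if base in PYRIMIDINES:
            steps.append(1)
        elif base in PURINES:
            steps.append(-1)
        else:
            raise ValueError(f"unexpected symbol: {base!r}")
    # Running sum of the steps gives the walk itself.
    return list(accumulate(steps))


if __name__ == "__main__":
    print(dna_walk("ACGT"))  # [-1, 0, -1, 0]
```

Correlations between nucleotides are then usually studied through how the fluctuations of this walk grow with window length; that analysis is left out of the sketch.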

5.
Flexible regions in biomolecular complexes, although crucial to understanding structure–function relationships, are often unclear in high-resolution crystal structures. In this study, we showed that single-molecule techniques, in combination with computational modeling, can characterize dynamic conformations not resolved by high-resolution structure determination methods. Taking two Pif1 helicases (ScPif1 and BsPif1) as model systems, we found that, besides a few tightly bound nucleotides, adjacent solvent-exposed nucleotides interact dynamically with the helicase surfaces. The whole nucleotide segment possessed curved conformations and covered the two RecA-like domains of the helicases, which are essential for the inch-worm mechanism. The synergetic approach reveals that the interactions between the exposed nucleotides and the helicases could be reduced by large stretching forces or electrostatically shielded with high-concentration salt, subsequently resulting in reduced translocation rates of the helicases. The dynamic interactions between the exposed nucleotides and the helicases underlie the force- and salt-dependences of their enzymatic activities. The present single-molecule based approach complements high-resolution structural methods in deciphering the molecular mechanisms of the helicases.

6.
DNA sequences seen in the normal character-based representation appear to have a formidable mixing of the four nucleotides without any apparent order. Nucleotide frequencies and distributions in the sequences have been studied extensively ever since Chargaff, almost a century ago, gave the simple rule that equates the total number of purines to that of pyrimidines in a duplex DNA sequence. While it is difficult to trace any relationship between the bases from studies in the character representation of a DNA sequence, graphical representations may provide a clue. These novel representations of DNA sequences have been useful in providing an overview of base distribution and composition of the sequences and providing insights into many hidden structures. We report here our observation based on a graphical representation that the intra-purine and intra-pyrimidine differences in sequences of conserved genes generally follow a quadratic distribution relationship and show that this may have arisen from mutations in the sequences over evolutionary time scales. From this hitherto undescribed relationship for the gene sequences considered in this report we hypothesize that such relationships may be characteristic of these sequences and therefore could become a barrier to large scale sequence alterations that override such characteristics, perhaps through some monitoring process inbuilt in the DNA sequences. Such a relationship also raises the possibility of intron sequences playing an important role in maintaining the characteristics and could be indicative of possible intron-late phenomena.

7.
Virtually all pre-mRNA introns begin with the sequence /GU and end with AG/ (where / indicates a border between an exon and an intron). We have previously shown that the G residues at the first and last positions of the yeast actin intron interact during the second step of splicing. In this work, we ask if other highly conserved intron nucleotides also take part in this /G-G/ interaction. Of special interest is the penultimate intron nucleotide (AG/), which is important for the second step of splicing and is in proximity to other conserved intron nucleotides. Therefore, we tested interactions of the penultimate intron nucleotide with the second intron nucleotide (/GU) and with the branch site nucleotide. We also tested two models that predict interactions between sets of three conserved intron nucleotides. In addition, we used random mutagenesis and genetic selection to search for interactions between nucleotides in the pre-mRNA. We find no evidence for other interactions between intron nucleotides besides the interaction between the first and last intron nucleotides.

8.
Prediction of protein-RNA interactions at the atomic level of detail is crucial for our ability to understand and interfere with processes such as gene expression and regulation. Here, we investigate protein binding pockets that accommodate extruded nucleotides not involved in RNA base pairing. We observed that most of the protein-interacting nucleotides are part of a consecutive fragment of at least two nucleotides whose rings have significant interactions with the protein. Many of these share the same protein binding cavity and more than 30% of such pairs are π-stacked. Since these local geometries cannot be inferred from the nucleotide identities, we present a novel framework for their prediction from the properties of protein binding sites. First, we present a classification of known RNA nucleotide and dinucleotide protein binding sites and identify the common types of shared 3-D physicochemical binding patterns. These are recognized by a new classification methodology that is based on spatial multiple alignment. The shared patterns reveal novel similarities between dinucleotide binding sites of proteins with different overall sequences, folds and functions. Given a protein structure, we use these patterns for the prediction of its RNA dinucleotide binding sites. Based on the binding modes of these nucleotides, we further predict an RNA fragment that interacts with those protein binding sites. With these knowledge-based predictions, we construct an RNA fragment that can have a previously unknown sequence and structure. In addition, we provide a drug design application in which the database of all known small-molecule binding sites is searched for regions similar to nucleotide and dinucleotide binding patterns, suggesting new fragments and scaffolds that can target them.

9.
Loop-loop interactions among nucleic acids constitute an important form of molecular recognition in a variety of biological systems. In HIV-1, genomic dimerization involves an intermolecular RNA loop-loop interaction at the dimerization initiation site (DIS), a hairpin located in the 5' noncoding region that contains an autocomplementary sequence in the loop. Only two major DIS loop sequence variants are observed among natural viral isolates. To investigate sequence and structural constraints on genomic RNA dimerization as well as loop-loop interactions in general, we randomized several or all of the nucleotides in the DIS loop and selected in vitro for dimerization-competent sequences. Surprisingly, increasing interloop complementarity above a threshold of 6 bp did not enhance dimerization, although the combinations of nucleotides forming the theoretically most stable hexanucleotide duplexes were selected. Noncanonical interactions contributed significantly to the stability and/or specificity of the dimeric complexes as demonstrated by the overwhelming bias for noncanonical base pairs closing the loop and covariations between flanking and central loop nucleotides. Degeneration of the entire loop yielded a complex population of dimerization-competent sequences whose consensus sequence resembles that of wild-type HIV-1. We conclude from these findings that the DIS has evolved to satisfy simultaneous constraints for optimal dimerization affinity and the capacity for homodimerization. Furthermore, the most constrained features of the DIS identified by our experiments could be the basis for the rational design of DIS-targeted antiviral compounds.

10.
We present a high-throughput, versatile approach to identify RNA-protein interactions and to determine nucleotides important for specific protein binding. In this approach, oligonucleotides are coupled to microbeads and hybridized to RNA-protein complexes. The presence or absence of RNA and/or protein fluorescence indicates the formation of an oligo-RNA-protein complex on each bead. The observed fluorescence is specific for both the hybridization and the RNA-protein interaction. We find that the method can discriminate noncomplementary and mismatch sequences. The observed fluorescence reflects the affinity and specificity of the RNA-protein interaction. In addition, the fluorescence patterns footprint the protein recognition site to determine nucleotides important for protein binding. The system was developed with the human protein U1A binding to RNAs derived from U1 snRNA but can also detect RNA-protein interactions in total RNA backgrounds. We propose that this strategy, in combination with emerging coded bead systems, can identify RNAs and RNA sequences important for interacting with RNA-binding proteins on genomic scales.

11.
According to a currently accepted model, enzymes engage in high-rate sliding along DNA when searching for specific recognition sequences or structural elements (modified nucleotides, breaks, single-stranded DNA fragments, etc.). Such sliding requires these enzymes to possess sufficiently high affinity for DNA of any sequence. Thus, significant differences in the enzymes' affinity for specific and nonspecific DNA sequences cannot be expected, and formation of a complex between an enzyme and its target DNA is unlikely to contribute significantly to the enzyme's specificity. To elucidate the factors providing the specificity, we have analyzed many DNA replication, DNA repair, topoisomerization, integration, and recombination enzymes using a number of physicochemical methods, including a method of stepwise increase in ligand complexity developed in our laboratory. It was shown that the high affinity of all studied enzymes for long DNA is provided by the formation of many weak contacts of the enzymes with all nucleotide units covered by protein globules. Contacts of positively charged amino acid residues with internucleotide phosphate groups contribute most to such interactions; the contribution of each contact is very small and the full contact interface usually resembles interactions between oppositely charged biopolymer surfaces. In some cases a significant contribution to the affinity is made through hydrophobic and/or van der Waals interactions of the enzymes with nucleobases. Overall, depending on the enzyme, such nonspecific interactions provide 5-8 orders of magnitude of the enzyme affinity for DNA. Specific interactions of enzymes with long DNA, in contrast to contacts of enzymes with small ligands, are usually weak and comparable in efficiency with weak nonspecific contacts. The sum of specific interactions most often provides approximately one and rarely two orders of magnitude of the affinity.
According to structural data, DNA binding to any of the investigated enzymes is followed by a stage of DNA conformation adjustment including partial or complete DNA melting, deformation of its backbone, stretching, compression, bending or kinking, eversion of nucleotides from the DNA helix, etc. The full set of such changes is characteristic for each individual enzyme. The fact that all enzyme-dependent changes in DNA are effected through weak specific rather than strong interactions is very important. Enzyme-specific changes in DNA conformation are required for effective adjustment of reacting orbitals with an accuracy of about 10-15 degrees, which is possible only for specific DNA. A transition from nonspecific to specific DNA leads to an increase in the reaction rate (kcat) by 4-8 orders of magnitude. Thus, the stages of DNA conformation adjustment and catalysis proper provide the high specificity of enzyme action.

12.
13.
14.
Lo SL, Cai CZ, Chen YZ, Chung MC. Proteomics, 2005, 5(4): 876-884
Knowledge of protein-protein interaction is useful for elucidating protein function via the concept of 'guilt-by-association'. A statistical learning method, Support Vector Machine (SVM), has recently been explored for the prediction of protein-protein interactions using artificial shuffled sequences as hypothetical noninteracting proteins and it has shown promising results (Bock, J. R., Gough, D. A., Bioinformatics 2001, 17, 455-460). It remains unclear, however, how the prediction accuracy is affected if real protein sequences are used to represent noninteracting proteins. In this work, this effect is assessed by comparison of the results derived from the use of real protein sequences with those derived from the use of shuffled sequences. The real protein sequences of hypothetical noninteracting proteins are generated from an exclusion analysis in combination with subcellular localization information of interacting proteins found in the Database of Interacting Proteins. Prediction accuracy using real protein sequences is 76.9% compared to 94.1% using artificial shuffled sequences. The discrepancy likely arises from the expected higher level of difficulty for separating two sets of real protein sequences than that for separating a set of real protein sequences from a set of artificial sequences. The use of real protein sequences for training an SVM classification system is expected to give better prediction results in practical cases. This is tested by using both SVM systems for predicting putative protein partners of a set of thioredoxin related proteins. The prediction results are consistent with observations, suggesting that real sequences are more practically useful in the development of an SVM classification system for facilitating protein-protein interaction prediction.
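
To make the "artificial shuffled sequences" idea concrete, the sketch below generates a hypothetical noninteracting protein by randomly permuting the residues of a real sequence. This is an assumption for illustration: the abstract does not specify the shuffling procedure, and the function name `shuffled_negative` and the example sequence are hypothetical.

```python
import random


def shuffled_negative(sequence, rng):
    """Return a residue-shuffled copy of `sequence`.

    Assumed scheme: a plain random permutation, which preserves amino acid
    composition and length but destroys residue order.
    """
    residues = list(sequence)
    rng.shuffle(residues)          # in-place permutation using the supplied RNG
    return "".join(residues)


if __name__ == "__main__":
    rng = random.Random(0)         # fixed seed so runs are repeatable
    real = "MVKQIESKTAFQEALDAAGDK"  # made-up example sequence
    print(shuffled_negative(real, rng))
```

Because a shuffled sequence keeps only the composition of the original, separating it from real proteins is plausibly an easier task than separating two sets of real proteins, which is the explanation the abstract offers for the 94.1% versus 76.9% accuracy gap.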

15.
Regulation of splicing in eukaryotes occurs through the coordinated action of multiple splicing factors. Exons and introns contain numerous putative binding sites for splicing regulatory proteins. Regulation of splicing is presumably achieved by the combinatorial output of the binding of splicing factors to the corresponding binding sites. Although putative regulatory sites often overlap, no extensive study has examined whether overlapping regulatory sequences provide yet another dimension to splicing regulation. Here we analyzed experimentally identified splicing regulatory sequences using a computational method based on the natural distribution of nucleotides and splicing regulatory sequences. We uncovered positive and negative interplay between overlapping regulatory sequences. Examination of these overlapping motifs revealed a unique spatial distribution, especially near splice donor sites of exons with weak splice donor sites. The positively selected overlapping splicing regulatory motifs were highly conserved among different species, implying functionality. Overall, these results suggest that overlap of two splicing regulatory binding sites is an evolutionarily conserved, widespread mechanism of splicing regulation. Finally, over-abundant motif overlaps were experimentally tested in a reporting minigene, revealing that overlaps may facilitate a mode of splicing that did not occur in the presence of only one of the two regulatory sequences that comprise it.

16.
Palindromic Units (PU or REP) were defined as DNA sequences of 40 nucleotides highly repeated on the genome of Escherichia coli and other Enterobacteriaceae. PU are found in clusters of up to six occurrences always localized in extragenic regions. By sorting the DNA sequences of the known PU-containing regions into different classes, we show here for the first time that, besides the PU themselves, each PU cluster contains a number of other conserved sequence motifs. Seven such motifs were identified with the present list of PU regions. Remarkably, each PU cluster is exclusively composed of a mosaic combination of PU and of these other sequence motifs. We demonstrate directly by hybridization experiments that one of these motifs (called L) is indeed present in a large number of copies on the Escherichia coli chromosome and that its distribution follows the same species specificity as PU sequences themselves. We propose that the mosaic pattern of motif combination in PU clusters reveals a new type of bacterial genetic element which we propose to call BIME for Bacterial Interspersed Mosaic Element. The Escherichia coli genome contains about 500 BIME.

17.
A hypothetical three-dimensional model of the cytochrome c peroxidase·tuna cytochrome c complex is presented. The model is based on known x-ray structures and supported by chemical modification and kinetic data. Cytochrome c peroxidase contains a ring of aspartate residues with a spatial distribution on the molecular surface that is complementary to the distribution of highly conserved lysines surrounding the exposed edge of the cytochrome c heme crevice, namely lysines 13, 27, 72, 86, and 87. These lysines are known to play a functional role in the reaction with cytochrome c peroxidase, cytochrome oxidase, cytochrome c1, and cytochrome b5. A hypothetical model of the complex was constructed with the aid of a computer-graphics display system by visually optimizing hydrogen bonding interactions between complementary charged groups. The two hemes in the resulting model are parallel with an edge separation of 16.5 Å. In addition, a system of inter- and intramolecular pi-pi and hydrogen bonding interactions forms a bridge between the hemes and suggests a mechanism of electron transfer.

18.
Aminoacyl tRNA synthetases (aaRS) are grouped into Class I and II based on primary and tertiary structure and enzyme properties suggesting two independent phylogenetic lineages. Analogously, tRNA molecules can also form two respective classes, based on the class membership of their corresponding aaRS. Although some aaRS-tRNA interactions are not extremely specific and require editing mechanisms to avoid misaminoacylation, most aaRS-tRNA interactions are rather stereospecific. Thus, class-specific aaRS features could be mirrored by class-specific tRNA features. However, previous investigations failed to detect conserved class-specific nucleotides. Here we introduce a discrete mathematical approach that evaluates not only class-specific 'strictly present', but also 'strictly absent' nucleotides. The disjoint subsets of these elements compose a unique partition, named extended consensus partition (ECP). By analyzing the ECP for both Class I and II tDNA sets from 50 (13 archaeal, 30 bacterial and 7 eukaryotic) species, we could demonstrate that class-specific tRNA sequence features do exist, although not in terms of strictly conserved nucleotides as had previously been anticipated. This finding demonstrates that important information was hidden in tRNA sequences, inaccessible to traditional statistical methods. The ECP analysis might contribute to the understanding of tRNA evolution and could enrich the sequence analysis tool repertoire.

19.
Here, we describe a method that offers a unique way to engineer plasmids with precision but without digestion using restriction enzymes for the insertion of DNA. The method allows the insertion of PCR fragments in between any two nucleotides within a target plasmid. The only requirement is that the amplified fragments must be embedded between DNA sequences homologous to the site in which the integration is planned. This method is an adaptation of the QuikChange Site-Directed Mutagenesis protocol. It is simpler than the existing cloning strategies and is suitable for multiparallel constructions of new plasmids. We have demonstrated its utility by constructing plasmids in which we have successfully integrated PCR fragments up to 1117 bp.

20.
Translational control of the GCN4 gene involves two short open reading frames in the mRNA leader (uORF1 and uORF4) that differ greatly in the ability to allow reinitiation at GCN4 following their own translation. The low efficiency of reinitiation characteristic of uORF4 can be reconstituted in a hybrid element in which the last codon of uORF1 and 10 nucleotides 3' to its stop codon (the termination region) are substituted with the corresponding nucleotides from uORF4. To define the features of these 13 nucleotides that determine their effects on reinitiation, we separately randomized the sequence of the third codon and termination region of the uORF1-uORF4 hybrid and selected mutant alleles with the high-level reinitiation that is characteristic of uORF1. The results indicate that many different A+U-rich triplets present at the third codon of uORF1 can overcome the inhibitory effect of the termination region derived from uORF4 on the efficiency of reinitiation at GCN4. Efficient reinitiation is not associated with codons specifying a particular amino acid or isoacceptor tRNA. Similarly, we found that a diverse collection of A+U-rich sequences present in the termination region of uORF1 could restore efficient reinitiation at GCN4 in the presence of the third codon derived from uORF4. To explain these results, we propose that reinitiation can be impaired by stable base pairing between nucleotides flanking the uORF1 stop codon and either the tRNA which pairs with the third codon, the rRNA, or sequences located elsewhere in GCN4 mRNA. We suggest that these interactions delay the resumption of scanning following peptide chain termination at the uORF and thereby lead to ribosome dissociation from the mRNA.
