Similar articles
1.
2.
Computer building and folding of fictitious transfer-RNA sequences   Cited by: 1 (self-citations: 0, citations by others: 1)
P Marlière. Biochimie, 1983, 65(4-5): 267-273
To evaluate how commonly polynucleotides may adopt the cloverleaf configuration, 1150 random sequences were computer built and folded into their most stable secondary structure. Various constraints modulated the generation of the sequences: i) the base-pairing pattern, ii) the nucleotide composition, iii) the presence of assigned bases (modified or not) at certain sites, and iv) the chain length. In many cases, artificial tRNAs appear to require a more complex organization than a cloverleaf pairing scheme to achieve, as do natural molecules, the corresponding secondary structure. Moreover, the preferred foldings of sequences from 50 to 90 nucleotides long without an imposed pairing pattern usually contain two rather than three hairpin-loops. Implications concerning the emergence and the evolution of the protein-synthesis apparatus are discussed.

3.
4‐α‐Glucanotransferase (GTase, D‐enzyme) catalyzes disproportionation between two short polymers of maltooligosaccharides linked by α‐1,4‐glucoside bonds. Using action modes of the potato GTase for the donor and acceptor substrates, the Monte Carlo method was applied to simulate the GTase reaction. The simulation starts from a single enzyme molecule and a finite number (10^5) of substrate molecules. All selection processes were performed using random numbers produced by computer. The initial substrates were from trimer to 10‐mer. In every case, the final stage was the steady‐state distribution of polymers. The steady‐state distribution produced by the potato GTase reaction differed from that produced by the hypothetical random disproportionation reaction. The simulated data from the reaction of potato GTase and trimer agreed almost quantitatively with experimental data. The mechanism of the GTase reaction was an accumulation of probabilistic processes and was well simulated by the Monte Carlo method. GTase randomizes the overall distribution of chain length of the substrate. Therefore the GTase reaction is an entropy‐driven process. © 1999 John Wiley & Sons, Inc. Biopoly 50: 145–151, 1999
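The Monte Carlo idea in this abstract can be illustrated with a minimal Python sketch. It models only the hypothetical random disproportionation mentioned as the comparison case, not the potato GTase action modes, which the abstract does not specify. The function name, the default parameters, and the rule that a donor keeps at least one glucose unit are all assumptions made for illustration.

```python
import random
from collections import Counter


def simulate_random_disproportionation(n_molecules=1000, start_length=3,
                                       n_steps=20000, min_length=1, seed=0):
    """Toy Monte Carlo model of random disproportionation (illustrative only).

    Each step picks a donor and an acceptor chain at random and moves a
    random number of units from donor to acceptor. The donor always keeps
    at least `min_length` units (an assumed rule, not taken from the paper).
    Returns a Counter mapping chain length -> number of chains.
    """
    rng = random.Random(seed)
    chains = [start_length] * n_molecules
    for _ in range(n_steps):
        donor, acceptor = rng.sample(range(n_molecules), 2)
        max_transfer = chains[donor] - min_length
        if max_transfer < 1:
            continue  # donor is too short to give anything away
        k = rng.randint(1, max_transfer)
        chains[donor] -= k
        chains[acceptor] += k
    return Counter(chains)


if __name__ == "__main__":
    distribution = simulate_random_disproportionation()
    for length in sorted(distribution):
        print(length, distribution[length])
```

The number of chains and the total number of glucose units are both conserved at every step, which is the defining property of a disproportionation; only the chain-length distribution changes as the simulation approaches its steady state.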

4.
Analysis of highly repeated DNA sequences of rat with EcoR1 endonuclease   Cited by: 2 (self-citations: 0, citations by others: 2)
Cleavage of rat liver nuclear DNA with EcoR1 restriction endonuclease yields 14 discrete fragments ranging from 2300 to 93 base pairs in length, representing approx. 10.5% of the rat genome. Fragments of 1500, 180, and 93 base pairs are reiterated over 100 000 times; fragments of 2300, 880, 290, and 200 base pairs are reiterated over 20 000 times; the remaining fragments are present in over 1000 copies per genome. When compared to whole rat DNA, 11 were 1-5% richer in A·T base pairs and five were 1.5-2.5 times more methylated. From the criteria of the banding patterns in complete and incomplete digests, base composition and extent of methylation, none of these fragments appeared to be generated as oligomers of a basic shorter repeat. The reassociation of EcoR1 fragments was monitored on hydroxyapatite and by S1 nuclease treatment in order to assess band reiteration frequency and the possibility of interspersion or short internal repeats. The renaturation of the four smallest EcoR1 fragments gave no indication of short internal repeats from hyperpolymer formation nor of interspersion with lower frequency sequences by size reduction after S1 nuclease treatment. Anomalous renaturation of several large fragments was observed, possibly due to internal repeats.

5.
The flexibility of alternating dA-dT sequences   Cited by: 3 (self-citations: 0, citations by others: 3)
The flexibility of alternating poly (dA-dT) has been investigated by the technique of transient electric dichroism. Rotational relaxation times, which are very sensitive to changes in the end-to-end length of flexible polymers, are determined from the field free dichroism decay curves of four well-defined fragments of poly (dA-dT) ranging in size from 136 to 270 base pairs. Persistence lengths, calculated from the results of Hagerman and Zimm (Biopolymers (1981) 29, 1481-1502), are in the range 200-250 Å. This makes alternating dA-dT sequences about twice as flexible as naturally occurring, "random" sequence DNA. Considering a bend around a nucleosome, for example, this difference in persistence length translates to an energy difference between poly (dA-dT) and random sequence DNA of 0.17 kT/base pair or 1 kcal per 10 base pair stretch. This energy difference is sufficiently large to suggest that dA-dT sequences could serve as markers in DNA packaging, for example, at sites where DNA must tightly bend to accommodate structures.

6.
The complete amino acid sequence of the alpha chain of human fibrinogen has been determined. It contains 610 amino acid residues and has a calculated molecular weight of 66,124. The chain has 10 methionines, and fragmentation with cyanogen bromide yields 11 peptides [Doolittle, R.F., Cassman, K.G., Cottrell, B.A., Friezner, S.J., Hucko, J.T., & Takagi, T. (1977) Biochemistry 16, 1703]. The arrangement of the 11 fragments was determined by the isolation of peptide overlaps from plasmic and staphylococcal protease digests of fibrinogen and/or alpha chains. In addition, certain of the cyanogen bromide fragments, preliminary reports of whose sequences have appeared previously, have been reexamined in order to resolve several discrepancies. The alpha chain is homologous with the beta and gamma chains of fibrinogen, although a large repetitive segment of unusual composition is absent from the latter two chains. The existence of this unusual segment divides the sequence of the alpha chain into three zones of about 200 residues each that are readily distinguishable on the basis of amino acid composition alone.

7.
The mean free energy generated from the secondary structure of RNA sequences of varying length and composition has been studied by way of probability theory. The expected boundaries or maximal and minimal values of a given distribution are explored and a method for estimating error as a function of the number of shuffled sequences is also examined. For typical nucleotide sequences found in biologically active organisms, the mean free energy, free energy distributions and errors appear to be scalable in terms of a fixed set of algorithm-dependent parameters and the nucleotide composition of the particular sequence under evaluation. In addition, a general semi-analytical formula for predicting the mean free energy is proposed which, at least to first-order approximation, can be used to rapidly predict the mean free energy of any sequence length and composition of RNA. The general methodology appears to be algorithm independent. The results are expected to provide a reference point for certain types of analysis related to structure of RNA or DNA sequences and to assist in measuring the somewhat related matter of complexity in algorithm development. Some related applications are discussed.

8.
Activated human complement-classical-pathway enzyme C1r has previously been shown to undergo autolytic cleavages occurring in the A chain [Arlaud, Villiers, Chesne & Colomb (1980) Biochim. Biophys. Acta 616, 116-129]. Chemical analysis of the autolytic products confirms that the A chain undergoes two major cleavages, generating three fragments, which have now been isolated and characterized. The N-terminal alpha fragment (approx. 210 residues long) has a blocked N-terminus, as does the whole A chain, whereas N-terminal sequences of fragments beta and gamma (approx. 66 and 176 residues long respectively) do not, and their N-terminal sequences were determined. Fragments alpha, beta and gamma, which are not interconnected by disulphide bridges, are located in this order within C1r A chain. Fragment gamma is disulphide-linked to the B chain of C1r, which is C-terminal in the single polypeptide chain of precursor C1r. CNBr cleavage of C1r A chain yields seven major peptides, CN1b, CN4a, CN2a, CN1a, CN3, CN4b and CN2b, which were positioned in that order, on the basis of N-terminal sequences of the methionine-containing peptides generated from tryptic cleavage of the succinylated (3-carboxypropionylated) C1r A chain. About 60% of the sequence of C1r A chain (440-460 residues long) was determined, including the complete sequence of the C-terminal 95 residues. This region shows homology with the corresponding parts of plasminogen and chymotrypsinogen and, more surprisingly, with the alpha 1 chain of human haptoglobin 1-1, a serine proteinase homologue.

9.
An Eulerian path approach to global multiple alignment for DNA sequences.   Cited by: 3 (self-citations: 0, citations by others: 3)
With the rapid increase in the dataset of genome sequences, the multiple sequence alignment problem is increasingly important and frequently involves the alignment of a large number of sequences. Many heuristic algorithms have been proposed to improve the speed of computation and the quality of alignment. We introduce a novel approach that is fundamentally different from all currently available methods. Our motivation comes from the Eulerian method for fragment assembly in DNA sequencing that transforms all DNA fragments into a de Bruijn graph and then reduces sequence assembly to an Eulerian path problem. The paper focuses on global multiple alignment of DNA sequences, where entire sequences are aligned into one configuration. Our main result is an algorithm with almost linear computational speed with respect to the total size (number of letters) of sequences to be aligned. Five hundred simulated sequences (averaging 500 bases per sequence and as low as 70% pairwise identity) have been aligned within three minutes on a personal computer, and the quality of alignment is satisfactory. As a result, accurate and simultaneous alignment of thousands of long sequences within a reasonable amount of time becomes possible. Data from an Arabidopsis sequencing project is used to demonstrate the performance.
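The fragment-assembly idea that motivates this paper, turning fragments into a de Bruijn graph and reading a sequence off an Eulerian path, can be sketched in a few lines of Python. This is a generic textbook construction under simplifying assumptions (error-free fragments, distinct k-mers, a path that exists), not the alignment algorithm of the paper. The function names `build_de_bruijn`, `eulerian_path`, and `spell` are hypothetical.

```python
from collections import defaultdict


def build_de_bruijn(fragments, k):
    """Build a de Bruijn graph: nodes are (k-1)-mers, edges are distinct k-mers."""
    kmers = set()
    for frag in fragments:
        for i in range(len(frag) - k + 1):
            kmers.add(frag[i:i + k])
    graph = defaultdict(list)
    for kmer in sorted(kmers):  # sorted only to make the result deterministic
        graph[kmer[:-1]].append(kmer[1:])
    return graph


def eulerian_path(graph):
    """Return an Eulerian path as a list of nodes (Hierholzer's algorithm).

    Assumes such a path exists; no validity check is performed.
    """
    out_edges = {node: list(targets) for node, targets in graph.items()}
    if not out_edges:
        return []
    in_degree = defaultdict(int)
    for targets in out_edges.values():
        for target in targets:
            in_degree[target] += 1
    nodes = set(out_edges) | set(in_degree)
    # Start at the node with one more outgoing than incoming edge, if any.
    start = next((n for n in nodes
                  if len(out_edges.get(n, [])) - in_degree[n] == 1), None)
    if start is None:
        start = next(iter(out_edges))
    stack, path = [start], []
    while stack:
        node = stack[-1]
        if out_edges.get(node):
            stack.append(out_edges[node].pop())
        else:
            path.append(stack.pop())
    return path[::-1]


def spell(path):
    """Spell the sequence traced by a path of overlapping (k-1)-mers."""
    if not path:
        return ""
    return path[0] + "".join(node[-1] for node in path[1:])
```

For example, `spell(eulerian_path(build_de_bruijn(["ACGT", "CGTC"], 3)))` reconstructs the sequence covered by the two overlapping fragments.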

10.
Analysis of rat repetitive DNA sequences.   Cited by: 8 (self-citations: 0, citations by others: 8)
Parameters of repetitive sequence organization have been measured in the rat genome. Experiments using melting, hydroxylapatite binding, and single strand specific nuclease digestion have been used to measure the number, length, and arrangement of repeated DNA sequences. Renaturation and melting or S1 nuclease digestion of 1.0 kbp DNA fragments show about 20% of rat DNA sequences are 3000-fold repeated. Renatured duplexes from 4.0 kbp DNA fragments display two repetitive size fractions after nuclease digestion. About 60% of the repeated sequences are 0.2-0.4 kbp long while the remainder are longer than 1.5 kbp. The arrangement of the repeated sequences has been measured by hydroxylapatite fractionation of DNA fragments of varying lengths bearing a repeated sequence. Repeated DNA sequences are interspersed among 2.5 kbp long nonrepeated sequences throughout more than 70% of the rat genome. There are approximately 350 different 3000-fold short repeated sequences in the rat interspersed among 600,000 nonrepeated DNA sequences.

11.
Having obtained the amino acid composition of a protein, chemists and molecular biologists may wish to identify the protein from this data alone. In general such data will have errors associated with them and the length of the protein may be known only approximately or not at all. In this paper a method is described which enables searching of protein sequence databases for sequences or fragments of sequences which have a composition similar to the one being sought. Such searches are generally quite discriminating as shown by the examples provided. This method has been implemented as part of the computer program Scrutineer and is being freely distributed. It is simple to use.
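A composition-based database search of the kind described here can be sketched as follows. This is a minimal Python illustration, not the Scrutineer implementation: the distance measure (sum of absolute differences between fractional compositions), the function names, and the toy database are all assumptions.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq):
    """Return the fractional amino acid composition as a 20-element list."""
    counts = Counter(seq.upper())
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def composition_distance(comp_a, comp_b):
    """Sum of absolute differences between two compositions (0 = identical)."""
    return sum(abs(a - b) for a, b in zip(comp_a, comp_b))


def rank_database(target_composition, database):
    """Rank database entries (name -> sequence) by closeness to the target."""
    scored = [(composition_distance(target_composition, composition(seq)), name)
              for name, seq in database.items()]
    return sorted(scored)


if __name__ == "__main__":
    toy_database = {"p1": "AAAAGG", "p2": "WWWWYY", "p3": "AAGGWW"}
    target = composition("AGAGAA")  # measured composition of an unknown protein
    for distance, name in rank_database(target, toy_database):
        print(f"{name}\t{distance:.3f}")
```

Because fractional compositions are compared, the ranking does not depend on knowing the length of the unknown protein, which matches the situation the abstract describes.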

12.
Non-helical peptide fragments were isolated from rabbit skin collagen after cleavage of alpha chains with cyanogen bromide and proteases. Determination of their amino acid sequence indicated a length of 9, 16 and 25 amino acid residues for the non-helical sequences located in the N-terminal region of alpha2 and alpha1 chain and in the C-terminal region of alpha1 chain, respectively. The C-terminal sequence Tyr-Tyr hitherto considered as the genuine end of collagen alpha1 chain is in part of rabbit collagen extended by two residues, alanine and arginine. Rabbit collagen may differ considerably in its non-helical sequences from other vertebrate collagens, particularly in the C-terminal part. Some but not all of these differences are clustered in areas occupied by antigenic determinants which are recognized in the antibody response of rabbits to rat or calf collagen. On the other hand, a high homology to rabbit collagen, e.g. in the N-terminal region of rat collagen alpha1 chain or calf collagen alpha2 chain, probably prevents immunological recognition by the rabbit. The degree of foreignness alone, however, may not necessarily determine whether a particular non-helical area is able to express immunogenic activity.

13.
Tandemly repeated DNA sequences generated from single synthetic oligonucleotide monomers are useful for many purposes. With conventional ligation procedures, low yields and random orientation of oligomers make cloning of defined repeated sequences difficult. We solved these problems using 2 bp overhangs to direct orientation and random incorporation of linkers containing restriction sites during ligation. Ligation products are amplified by PCR using the linker oligonucleotides as primers. Restriction digestion of the PCR products generates multimer distributions whose length is controlled by the monomer/linker ratio. The concatenated DNA fragments of defined length, orientation and spacing can be directly used for subcloning or other applications without further treatment.

14.
Subword composition plays an important role in many analyses of sequences. Here we define and study the "local decoding of order N of sequences," an alternative that avoids some drawbacks of "subwords of length N" approaches while keeping information about environments of length N in the sequences ("decoding" is taken here in the sense of hidden Markov modeling, i.e., associating some state to all positions of the sequence). We present an algorithm for computing the local decoding of order N of a given set of sequences. Its complexity is linear in the total length of the set (whatever the order N) both in time and memory space. In order to show a use of local decoding, we propose a very basic dissimilarity measure between sequences which can be computed both from local decoding of order N and composition in subwords of length N. The accuracies of these two dissimilarities are evaluated, over several datasets, by computing their linear correlations with a reference alignment-based distance. These accuracies are also compared to the one obtained from another recent alignment-free comparison.

15.
We study an algorithm which allows sequences of binary numbers (strings) to interact with each other. The simplest system of this kind with a population of 4-bit sequences is considered here. Previously proposed folding methods are used to generate alternative two-dimensional forms of the binary sequences. The interaction of two-dimensional and one-dimensional forms of strings is simulated in a serial computer. The reaction network for the N = 4 system is established. Development of string populations initially generated randomly is observed. Nonlinear rate equations are proposed which provide a model for this simplest system. Dedicated to Professor Hermann Haken on the occasion of his 65th Birthday

16.
The DNA sequence organization of a homogeneously staining region (HSR) in the germ line of Mus musculus was studied with DNA clones generated by microdissection and microcloning. Six HSR-derived microclones were selected and characterized by Southern blot hybridizations. Four represented single-copy mouse DNA sequences. They were amplified in the HSR as fragments co-migrating with the respective normal mouse sequence and as additional fragments of different mobilities. The copy number of co-migrating fragments was approximately 16 for each of the four sequences but the number of rearranged fragments varied. Two microclones contained DNA sequences not detectable in normal mouse genomes but present, and one of them amplified, in the HSR. The observations suggest that the HSR developed from a part of the mouse genome by alternating replication and rearrangement events, with a specific integration of putative foreign DNA sequences.

17.
R A Firtel, K Kindle. Cell, 1975, 5(4): 401-411
The length and interspersion of reiterated and single-copy DNA sequences in Dictyostelium have been examined. The results indicate that approximately 50-60% of the single-copy sequences in DNA fragments 1500 nucleotides long and 75% of the single-copy sequences in fragments 3000 nucleotides long are linked to short interspersed repeat DNA sequences. The average length of these single-copy sequences is 1500 nucleotides. The length of the reiterated DNA has also been analyzed and shows a bimodal distribution. One half is present in sequences greater than 2000 nucleotides long, while the remainder is present as short fragments 250-450 nucleotides long. These shorter fragments are interspersed with the bulk of the single-copy DNA.

18.
MOTIVATION: The global alignment of protein sequence pairs is often used in the classification and analysis of full-length sequences. The calculation of a Z-score for the comparison gives a length and composition corrected measure of the similarity between the sequences. However, the Z-score alone does not indicate the likely biological significance of the similarity. In this paper, all pairs of domains from 250 sequences belonging to different SCOP folds were aligned and Z-scores calculated. The distribution of Z-scores was fitted with a peak distribution from which the probability of obtaining a given Z-score from the global alignment of two protein sequences of unrelated fold was calculated. A similar analysis was applied to subsequence pairs found by the Smith-Waterman algorithm. These analyses allow the probability that two protein sequences share the same fold to be estimated by global sequence alignment. RESULTS: The relationship between Z-score and probability varied little over the matrix/gap penalty combinations examined. However, an average shift of +4.7 was observed for Z-scores derived from global alignment of locally-aligned subsequences compared to global alignment of the full-length sequences. This shift was shown to be the result of pre-selection by local alignment, rather than any structural similarity in the subsequences. The search ability of both methods was benchmarked against the SCOP superfamily classification and showed that global alignment Z-scores generated from the entire sequence are as effective as SSEARCH at low error rates and more effective at higher error rates. However, global alignment Z-scores generated from the best locally-aligned subsequence were significantly less effective than SSEARCH. The method of estimating statistical significance described here was shown to give similar values to SSEARCH and BLAST, providing confidence in the significance estimation.
AVAILABILITY: Software to apply the statistics to global alignments is available from http://barton.ebi.ac.uk. CONTACT: geoff@ebi.ac.uk
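The Z-score mentioned in this abstract is commonly obtained by comparing the real alignment score with scores from shuffled sequences, which preserves length and composition. The Python sketch below illustrates that general idea only. It is not the authors' software: the simple match/mismatch scoring with a linear gap penalty stands in for a substitution matrix, and the function names and default parameters are assumptions.

```python
import random
import statistics


def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment score with a linear gap penalty."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur.append(max(diag, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]


def alignment_z_score(a, b, n_shuffles=100, seed=0):
    """Z-score of the a/b alignment against alignments of a with shuffled b.

    Shuffling b keeps its length and composition, so the Z-score corrects
    for both. The number of shuffles is an arbitrary illustrative choice.
    """
    rng = random.Random(seed)
    observed = global_alignment_score(a, b)
    letters = list(b)
    shuffled_scores = []
    for _ in range(n_shuffles):
        rng.shuffle(letters)
        shuffled_scores.append(global_alignment_score(a, "".join(letters)))
    mean = statistics.mean(shuffled_scores)
    sd = statistics.stdev(shuffled_scores)
    if sd == 0:
        raise ValueError("shuffled scores have zero variance")
    return (observed - mean) / sd
```

As the abstract notes, a Z-score on its own is not a probability; converting it to one requires fitting the distribution of Z-scores for unrelated pairs, which this sketch does not attempt.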

19.
Eukaryotic genomes contain many endogenous retroviral sequences (ERVs). ERVs are often severely mutated and therefore difficult to detect. A platform independent (Java) program package, RetroTector (ReTe), was constructed. It has three basic modules: (i) detection of candidate long terminal repeats (LTRs), (ii) detection of chains of conserved retroviral motifs fulfilling distance constraints and (iii) attempted reconstruction of original retroviral protein sequences, combining alignment, codon statistics and properties of protein ends. Other features are prediction of additional open reading frames, automated database collection, graphical presentation and automatic classification. ReTe favors elements >1000-bp long due to its dependence on order of and distances between retroviral fragments. It detects single or low-copy-number elements. ReTe assigned a 'retroviral' score of 890-2827 to 10 exogenous retroviruses from seven genera, and accurately predicted their genes. In a simulated model, ReTe was robust against mutational decay. The human genome was analyzed in 1-2 days on a LINUX cluster. Retroviral sequences were detected in divergent vertebrate genomes. Most ReTe detected chains were coincident with RepeatMasker output and the HERVd database. ReTe did not report most of the evolutionarily old HERV-L related and MalR sequences, and is not yet tailored for single LTR detection. Nevertheless, ReTe rationally detects and annotates many retroviral sequences.

20.