Similar articles
 20 similar articles found (search time: 15 ms)
1.
We develop a quantitative method for analyzing repetitions of identical short oligomers in coding and noncoding DNA sequences. We analyze sequences presently available in GenBank separately for the primate, mammal, vertebrate, rodent, invertebrate and plant taxonomic partitions. We find that some oligomers "cluster" more than they would if randomly distributed, while other oligomers "repel" each other. To quantify this degree of clustering, we define clustering measures. We find that (i) clustering differs significantly between coding and noncoding DNA; (ii) in most cases, monomers, dimers and tetramers cluster in noncoding DNA but appear to repel each other in coding DNA; (iii) the degree of clustering is more conserved across sources (primates, invertebrates, and plants) in coding DNA than in noncoding DNA; (iv) in contrast to other oligomers, trimers always prefer to cluster; and (v) clustering of each particular oligomer is conserved within the same organism.
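The abstract does not define its clustering measures, so the following is only a minimal sketch of one plausible way to quantify whether an oligomer "clusters" or "repels". It assumes a simple observed-to-expected ratio of immediate tandem repeats; the function name `clustering_ratio` and the measure itself are illustrative assumptions, not the authors' definitions.

```python
from collections import Counter


def clustering_ratio(seq: str, k: int) -> dict[str, float]:
    """Observed / expected count of immediate tandem repeats for each k-mer.

    Hypothetical measure: a ratio above 1 suggests the k-mer follows itself
    more often than expected if k-mers were placed independently
    ("clustering"); a ratio below 1 suggests "repulsion".
    """
    seq = seq.upper()
    n_kmers = len(seq) - k + 1      # number of k-mer start positions
    n_pairs = len(seq) - 2 * k + 1  # positions where a k-mer and its right neighbour both fit
    if n_pairs <= 0:
        return {}

    # Overall frequency of each k-mer (overlapping windows).
    counts = Counter(seq[i:i + k] for i in range(n_kmers))

    # Number of positions where a k-mer is immediately followed by itself.
    tandem = Counter(
        seq[i:i + k]
        for i in range(n_pairs)
        if seq[i:i + k] == seq[i + k:i + 2 * k]
    )

    ratios = {}
    for kmer, count in counts.items():
        p = count / n_kmers
        expected = p * p * n_pairs  # expectation under independent placement
        ratios[kmer] = tandem[kmer] / expected
    return ratios
```

Applied separately to coding and noncoding sequences, such ratios could be compared per oligomer length. The independence baseline ignores overlap effects between neighbouring windows, so it should be read as a rough reference point only.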

2.
The "universal correlation" (D'Onofrio, G., Bernardi, G., 1992. A universal compositional correlation among codon positions. Gene 110, 81-88.) that holds between GC3 and GC1 or GC2 (GC values are the average values of the coding sequences of each genome analyzed) at both the inter- and intra-genomic level, was re-analyzed on a vastly larger dataset. The results showed a slight, but significant, difference in the GC3 vs. GC2 correlations exhibited by prokaryotes and eukaryotes. This finding prompted an analysis of the correlation between GC3 and the amino acid frequencies in the encoded proteins, which has shown that positive correlations exist between GC3 values of coding sequences and the hydropathy of the corresponding proteins. These correlations are due to the fact that hydrophobic and amphipathic amino acids increase, whereas hydrophilic amino acids decrease with increasing GC3 values. Hydropathy values of prokaryotic proteins are systematically higher than those of eukaryotes, but the slopes of the regression lines are identical. The lower hydrophobicity of eukaryotic proteins is due to differences in the amino acid composition. In particular, the twofold higher cysteine (and disulfide bond) level of eukaryotic proteins compared to prokaryotic proteins most probably compensates for their lower hydrophobicity. This supports the viewpoint that hydrophobicity plays a structural and functional role as far as protein stability is concerned.
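A compositional correlation among codon positions starts from the base composition at each position. The following is a minimal sketch, assuming the quantity of interest is the G+C fraction at the first, second and third codon positions of an in-frame coding sequence; the function name `gc_by_codon_position` is hypothetical and the abstract does not describe how the authors computed their values.

```python
def gc_by_codon_position(cds: str) -> tuple[float, float, float]:
    """Return the G+C fraction at codon positions 1, 2 and 3.

    Assumes `cds` starts in frame; trailing bases that do not complete
    a codon are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # drop an incomplete final codon
    if usable == 0:
        raise ValueError("need at least one complete codon")

    n_codons = usable // 3
    gc = [0, 0, 0]
    for i in range(usable):
        if cds[i] in "GC":
            gc[i % 3] += 1  # i % 3 is the zero-based codon position
    return tuple(count / n_codons for count in gc)
```

Averaging these values over all coding sequences of a genome would give one point per genome, which could then be compared with a protein-level property such as mean hydropathy.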

3.
4.
5.
P McCaldon, P Argos. Proteins, 1988, 4(2): 99-122
We have examined oligopeptides with lengths ranging from 2 to 11 residues in protein sequences that show no obvious evolutionary relationship. All sequences in the Protein Identification Resource database were carefully classified by sensitive homology searches into superfamilies to obtain unbiased oligopeptide counts. The results, contrary to previous studies, show clear prejudices in protein sequences. The oligopeptide preferences were used to help decide the significance of sequence homologies and to improve the more general methods for detecting protein coding regions within nucleotide sequences.

6.
Comparison of coding nucleotide sequences of the paralogous GH1 and GH2 genes, as well as of the growth hormone amino acid sequences, in the species of closely related salmonid genera Salvelinus, Oncorhynchus, and Salmo was performed. It was demonstrated that, in different groups of salmonids, the amino acid substitution rates were considerably different. In some cases, an obvious discrepancy between the divergence of growth hormone genes and phylogenetic schemes based on other methods and approaches was revealed. These findings suggest that the reason may be multidirectional selection at duplicated genes at different stages of evolution.

7.
8.
E. coli strains producing a hybrid protein containing adrenocorticotropic hormone (ACTH) and protein A of S. aureus were obtained. The sequence coding for ACTH was obtained from bovine proopiomelanocortin cDNA and, after modification of its 5'- and 3'-terminal parts, was linked to the protein A gene and its derivatives by means of synthetic adaptors. Three forms of the ACTH gene, encoding the hormone with different N-terminal amino acids, were used to construct the fusion gene. The hybrid proteins contain Asp-Pro or (Asp)4-Lys sequences, allowing ACTH to be released by acid or enterokinase treatment, respectively. Each of the constructed plasmids was shown to direct the synthesis of the hybrid protein in E. coli. This protein was purified on IgG-Sepharose. The expression level of the hybrid protein is 4 mg/l of bacterial culture. Most of the synthesized protein is secreted into the periplasmic space.

9.
Guo ZF, Yan ZH, Wang JR, Wei YM, Zheng YL. Hereditas, 2005, 142(2005): 56-64
The high-molecular-weight (HMW) prolamine subunits and their coding sequences from the wheat-related diploid species Crithopsis delileana were investigated. SDS-PAGE analysis of the total storage protein fractions of two accessions of C. delileana revealed only one HMW prolamine subunit, with an electrophoretic mobility similar to that of the y-type HMW glutenin subunit of hexaploid wheat. Sequencing and expression analysis confirmed that this prolamine subunit was an x-type subunit. The amino acid sequence of this subunit had the same typical structure as the x-type HMW glutenin subunits previously described in wheat. An in-frame stop codon was found in the coding sequences of y-type prolamine subunits. Specific extraction of HMW prolamines and sequence analysis indicated that the coding region of the Ky prolamine subunit gene is very unlikely to be expressed as a full-length protein. Phylogenetic analysis indicated that the Kx subunit clustered with the 1Ax1 subunit on an interior parallel branch, and that the Ky subunit (inactive) was most closely related to the 1Ay subunit. The coding sequence of the Kx subunit was successfully expressed in a bacterial expression system, and the expressed protein had the same electrophoretic mobility as the Kx subunit from the seed of C. delileana. This is the first characterization of the HMW prolamine subunits encoded by the K genome of C. delileana.

10.
11.
Can the antisense strand of DNA encode a protein? This problem has been explored by means of the recently developed codon-analyzing graph.

12.
With the ever-increasing pace of genome sequencing, there is a great need for fast and accurate computational tools to automatically identify genes in these genomes. Although great progress has been made in the development of gene-finding algorithms during the past decades, there is still room for further improvement. In particular, the issue of recognizing short exons in eukaryotes is still not solved satisfactorily. This article is devoted to assessing various linear and kernel-based classification algorithms and selecting the best combination of Z-curve features to address this issue. Eight state-of-the-art linear and kernel-based supervised pattern recognition techniques were used to identify the short (21-192 bp) coding sequences of human genes. By measuring the prediction accuracy, the tradeoff between sensitivity and specificity, and the time consumption, the partial least squares (PLS) and kernel partial least squares (KPLS) algorithms were found to be the best linear and kernel-based classifiers, respectively. A surprising result was that, by making good use of the interpretability of the PLS and the Z-curve methods, a set of 93 Z-curve features proved to be the best selected combination. Using them, the average recognition accuracy was improved by as much as 7.7% by means of KPLS, compared with that obtained by Fisher discriminant analysis using 189 Z-curve variables (Gao and Zhang, 2004). The codes used are freely available from the following sources (implemented in MATLAB and supported on Linux and MS Windows): (1) SVM: http://www.support-vector-machines.org/SVM_soft.html. (2) GP: http://www.gaussianprocess.org. (3) KPLS and KFDA: Taylor, J.S., and Cristianini, N. 2004. Kernel Methods for Pattern Analysis. Cambridge University Press, Cambridge, UK. (4) PLS: Wise, B.M., and Gallagher, N.B. 2011. PLS-Toolbox for use with MATLAB: ver 1.5.2. Eigenvector Technologies, Manson, WA.
Supplementary Material for this article is available at www.liebertonline.com/cmb.
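To make the notion of Z-curve features more concrete, the following minimal sketch computes only the nine phase-specific mononucleotide Z-curve components of an in-frame sequence. It is an illustration in Python rather than the MATLAB code the abstract refers to; the function name `zcurve_mononucleotide_features` is hypothetical, and the 93- and 189-feature sets mentioned above contain further components that are not reproduced here.

```python
def zcurve_mononucleotide_features(cds: str) -> list[float]:
    """Return 9 values: the (x, y, z) Z-curve components for codon positions 1, 2, 3.

    For base frequencies a, c, g, t at one codon position:
        x = (a + g) - (c + t)   purine vs. pyrimidine
        y = (a + c) - (g + t)   amino vs. keto
        z = (a + t) - (g + c)   weak vs. strong hydrogen bonding
    Assumes `cds` starts in frame; an incomplete final codon is ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    if usable == 0:
        raise ValueError("need at least one complete codon")

    n_codons = usable // 3
    features: list[float] = []
    for phase in range(3):
        column = cds[phase:usable:3]  # all bases at this codon position
        a, c, g, t = (column.count(base) / n_codons for base in "ACGT")
        features += [
            (a + g) - (c + t),
            (a + c) - (g + t),
            (a + t) - (g + c),
        ]
    return features
```

A feature vector of this kind, computed for each candidate short sequence, is the sort of input a linear or kernel-based classifier would be trained on.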

13.
The paper reports results of a long-term (1964–1974) investigation on permanent study sites in natural forest ecosystems of the Tilio-Carpinetum and the Pino-Quercetum in the Bialowieza Forest. The influence of decaying logs and root craters was investigated. It was found that the main causes of uprooting were the spring and autumn winds. Wind direction and the position of logs lying on the ground are correlated. Picea is most susceptible to uprooting by winds. Almost one half of the trees of this species are alive at the moment of uprooting. By mapping changes over time in the distribution of uprooted trees on a permanent area, a balance of the turnover in a 10-year period was determined. It appeared that decomposition is slower than accumulation. From this, it was concluded that the stand is in a phase of natural thinning. In the study site, compartments were distinguished with various degrees of change in the number of uprooted trees, and the consequences of differentiation and constant transformation of the biotope and biocenosis by the occurrence of uprooted trees and by their decay are described. Nomenclature of species follows Flora Europaea. Contribution to the Symposium of the Working Group for Succession Research on Permanent Plots, held at Yerseke, the Netherlands, October 1975.

14.
We investigated the relationships between the nucleotide substitution rates and the predicted secondary structures in the three-state representation (alpha-helix, beta-sheet, and coil). The analysis was carried out on 34 alignments, each of which comprised sequences belonging to at least four different mammalian orders. The rates of synonymous substitution were found to be significantly different in regions predicted to be alpha-helix, beta-sheet, or coil. Likewise, the nonsynonymous rates also differ, although, as expected, to a lesser extent, among the three types of secondary structure, suggesting that different selective constraints associated with the different structures affect the synonymous and nonsynonymous rates in a similar way. Moreover, the base composition of the third codon positions is different in coding sequence regions corresponding to different secondary structures of proteins.

15.
Considerable progress has been made in understanding the structure, function and genetic regulation of high-molecular-weight (HMW) glutenin subunits in hexaploid wheat. In contrast, less is known about these types of proteins in wheat-related species. In this paper, we report the analysis of HMW glutenin subunits and their coding sequences in two diploid Aegilops species, Aegilops umbellulata (UU) and Aegilops caudata (CC). SDS-PAGE analysis demonstrated that, for each of the four Ae. umbellulata accessions, there were two HMW glutenin subunits (designated here as 1Ux and 1Uy) with electrophoretic mobilities comparable to those of the x- and y-type subunits encoded by the Glu-D1 locus, respectively. In our previous study involving multiple accessions of Ae. caudata, two HMW glutenin subunits (designated as 1Cx and 1Cy) with electrophoretic mobilities similar to those of the subunits controlled by the Glu-D1 locus were also detected. These results indicate that the U genome of Ae. umbellulata and the C genome of Ae. caudata encode HMW glutenin subunits that may be structurally similar to those specified by the D genome. The complete open reading frames (ORFs) coding for x- and y-type HMW glutenin subunits in the two diploid species were cloned and sequenced. Analysis of deduced amino acid sequences revealed that the primary structures of the x- and y-type HMW glutenin subunits of the two Aegilops species were similar to those of previously published HMW glutenin subunits. Bacterial expression of modified ORFs, in which the coding sequence for the signal peptide was removed, gave rise to proteins with electrophoretic mobilities identical to those of HMW glutenin subunits extracted from seeds, indicating that upon seed maturation the signal peptide is removed from the HMW glutenin subunit in the two species. Phylogenetic analysis showed that 1Ux and 1Cx subunits were most closely related to the 1Dx-type subunit encoded by the Glu-D1 locus. The 1Uy subunit possessed a higher level of homology to the 1Dy-type subunit compared with the 1Cy subunit. In conclusion, our study suggests that the Glu-U1 locus of Ae. umbellulata and the Glu-C1 locus of Ae. caudata specify the expression of HMW glutenin subunits in a manner similar to the Glu-D1 locus. Consequently, HMW glutenin subunits from the two diploid species may have potential value in improving the processing properties of hexaploid wheat varieties.

16.
We have cloned the structural genes for a regulated (PHO5) and a constitutive (PHO3) acid phosphatase from yeast by transformation and complementation of a yeast pho3, pho5 double mutant. Both genes are located on a 5.1-kb BamHI fragment. The cloned genes were identified on the basis of genetic evidence and by hybrid selection of mRNA coupled with in vitro translation and immunoprecipitation. Subcloning of partial Sau3A digests and functional in vivo analysis by transformation together with DNA sequence analysis showed that the two genes are oriented in the order (5') PHO5, PHO3 (3'). While the nucleotide sequences of the two coding regions are quite similar, the putative promoter regions show a lower degree of sequence homology. Partly divergent promoter sequences may explain the different regulation of the two genes.

17.
18.
19.
Low-complexity sequences are extremely abundant in eukaryotic proteins for reasons that remain unclear. One hypothesis is that they contribute to the formation of novel coding sequences, facilitating the generation of novel protein functions. Here, we test this hypothesis by examining the content of low-complexity sequences in proteins of different age. We show that recently emerged proteins contain more low-complexity sequences than older proteins and that these sequences often form functional domains. These data are consistent with the idea that low-complexity sequences may play a key role in the emergence of novel genes.

20.
The ribonucleotide reductase gene tandem bnrdE/bnrdF in SPβ-related prophages of different Bacillus spp. isolates presents different configurations of intervening sequences, comprising one to three of six non-homologous splicing elements. Insertion sites of group I introns and intein DNA are clustered in three relatively short segments encoding functionally important domains of the ribonucleotide reductase. Comparison of the bnrdE homologs reveals mutual exclusion of a group I intron and an intein coding sequence flanking the codon that specifies a conserved cysteine. In vivo splicing was demonstrated for all introns. However, for two of them, a fraction of the mRNA precursor molecules remains unspliced. Intergenic bnrdE-bnrdF regions are unexpectedly long, comprising between 238 and 541 nt. The longest encodes a putative polypeptide related to HNH homing endonucleases.
