Similar articles
1.
Nucleotide sequence analysis of cloned guinea-pig casein B cDNA sequences has identified two casein B variants related to the bovine and rat alpha s1 caseins. Amino acid homology was largely confined to the known bovine or predicted rat phosphorylation sites and within the 'signal' precursor sequence. Comparison of the deduced nucleotide sequence of the guinea-pig and rat alpha s1 casein mRNA species showed greater sequence conservation in the non-coding than in the coding regions, suggesting a functional and possibly regulatory role for the non-coding regions of casein mRNA. The results provide insight into the evolution of the casein genes, and raise questions as to the role of conserved nucleotide sequences within the non-coding regions of mRNA species.

2.
3.
The identification of conserved sequence tags (CSTs) through comparative genome analysis may reveal important regulatory elements involved in shaping the spatio-temporal expression of genetic information. It is well known that the most significant fraction of CSTs observed in human–mouse comparisons corresponds to protein-coding exons, owing to their strong evolutionary constraints. As we still do not know the complete gene inventory of the human and mouse genomes, it is of the utmost importance to establish whether detected conserved sequences are genes or not. We propose here a simple algorithm that, based on the observation of the specific evolutionary dynamics of coding sequences, efficiently discriminates between coding and non-coding CSTs. The application of this method may help the validation of predicted genes, the prediction of alternative splicing patterns in known and unknown genes and the definition of a dictionary of non-coding regulatory elements.

4.
Does the 'non-coding' strand code?   Total citations: 3, self-citations: 2, citations by others: 1
The hypothesis that DNA strands complementary to the coding strand contain in-phase coding sequences has been investigated. Statistical analysis of the 50 genes of bacteriophage T7 shows no significant correlation between patterns of codon usage on the coding and non-coding strands. In Bacillus and yeast genes the correlation observed is not different from that expected with random synonymous codon usage, while a high correlation seen in 52 E. coli genes can be explained in terms of an excess of RNY codons. A deficiency of UUA, CUA and UCA codons (complementary to termination) seems to be restricted to the E. coli genes, and may be due to low abundance of the relevant cognate tRNA species. Thus the analysis shows that the non-coding strand has the properties expected of a sequence complementary to a coding strand, with no indications that it encodes, or may have encoded, proteins.
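To make the idea of "in-phase" codons on the complementary strand concrete, here is a minimal Python sketch. It is not the authors' statistical procedure; the function names are hypothetical, and it assumes a DNA-alphabet coding sequence (T rather than U) that starts in frame. It only counts codons on both strands so that the two usage patterns can be compared.

```python
from collections import Counter

# Translation table for complementing DNA bases (assumes an A/C/G/T alphabet).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def codons(seq):
    """Split a sequence into in-frame codons, dropping a trailing partial codon."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def reverse_complement(seq):
    """Return the complementary strand read 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def strand_codon_counts(cds):
    """Count codons on the coding strand and, in phase, on the complementary strand.

    The sequence is trimmed to a multiple of three first, so reading the
    reverse complement from its first base keeps the same codon boundaries.
    """
    trimmed = cds[:len(cds) - len(cds) % 3]
    coding = Counter(codons(trimmed))
    noncoding = Counter(codons(reverse_complement(trimmed)))
    return coding, noncoding
```

With this framing, the codons TTA, CTA and TCA on the coding strand pair with TAA, TAG and TGA on the complementary strand, which is why the abstract describes them as complementary to termination.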

5.
The discovery of regulatory motifs embedded in upstream regions of plants is a particularly challenging bioinformatics task. Previous studies have shown that motifs in plants are short compared with those found in vertebrates. Furthermore, plant genomes have undergone several diversification mechanisms such as genome duplication events which impact the evolution of regulatory motifs. In this article, a systematic phylogenomic comparison of upstream regions is conducted to further identify features of the plant regulatory genomes, the component of genomes regulating gene expression, to enable future de novo discoveries. The findings highlight differences in upstream region properties between major plant groups and the effects of divergence times and duplication events. First, clear differences in upstream region evolution can be detected between monocots and dicots, thus suggesting that a separation of these groups should be made when searching for novel regulatory motifs, particularly since universal motifs such as the TATA box are rare. Second, investigating the decay rate of significantly aligned regions suggests that a divergence time of ~100 mya sets a limit for reliable conserved non-coding sequence (CNS) detection. Insights presented here will set a framework to help identify embedded motifs of functional relevance by understanding the limits of bioinformatics detection for CNSs.

6.
7.
8.
9.
10.
In Saccharomyces cerevisiae the expression of the cargB gene (coding for ornithine aminotransferase) is subject to dual regulation: an induction by allophanate and a specific induction process by arginine. We have determined the nucleotide sequence of the cargB gene along with its 5' region. The coding portion of the gene encodes a protein of 423 amino acid residues with a calculated Mr value of 46049. To characterize further the regulatory mechanisms modulating the expression of the gene, we have analyzed fusions of several fragments of the 5' non-coding region to lacZ, compared the 5' sequences of the coregulated cargA (coding for arginase) and cargB genes, and determined the nature of two constitutive cis-dominant mutations affecting the arginine control. These approaches allowed us to define three domains in the 5' non-coding region. The upstream one is implicated in the induction by allophanate. The two other domains are involved in the specific control by arginine; the target of the ARGR gene products, which mediate a positive regulation by arginine, lies upstream of another site where a repression by the CARGRI molecule occurs. The constitutive cargB+O- mutations are located in this repressor domain. The 5' non-coding region of cargA presents the same two-domain organization. These two domains contain three sequences homologous to the cargA and cargB 5' regions.

11.
We have constructed a non-homologous database, termed the Integrated Sequence-Structure Database (ISSD), which comprises the coding sequences of genes, amino acid sequences of the corresponding proteins, their secondary structure and straight phi, psi angle assignments, and polypeptide backbone coordinates. Each protein entry in the database holds the alignment of nucleotide sequence, amino acid sequence and the PDB three-dimensional structure data. The nucleotide and amino acid sequences for each entry are selected on the basis of exact matches of the source organism and cell environment. The current version 1.0 of ISSD is available on the WWW at http://www.protein.bio.msu.su/issd/ and includes 107 non-homologous mammalian proteins, of which 80 are human proteins. We have used the database to analyse synonymous codon usage patterns in mRNA sequences, showing their correlation with three-dimensional structure features in the encoded proteins. Possible ISSD applications include optimisation of protein expression, improvement of the protein structure prediction accuracy, and analysis of evolutionary aspects of the nucleotide sequence-protein structure relationship.

12.
13.
14.
15.
16.
17.
Complete nucleotide sequence of ovine alpha-lactalbumin mRNA   Total citations: 1, self-citations: 0, citations by others: 1
The nucleotide sequence of ovine alpha-lactalbumin mRNA has been determined by chemical sequencing of two cDNA recombinant plasmids and a primer extension product. Ovine alpha-lactalbumin mRNA contains 723 nucleotides (excluding the poly(A) tail), with a 5' non-coding region of 26 nucleotides, followed by the 426 nucleotides of the coding region, which determines a signal sequence of 19 amino acid residues and the 123 amino acid residues of mature alpha-lactalbumin. The coding region is followed by a 3' untranslated sequence of 271 nucleotides. The derived amino acid sequence of ovine pre-alpha-lactalbumin differs from that of its bovine counterpart by 8 amino acid substitutions, all but one originating from single mutations. Comparison of sequences of guinea pig, rat and human alpha-lactalbumin mRNAs with their ovine and bovine counterparts has revealed that these molecules have rapidly evolved. The highest degree of conservation was observed in the region coding for the mature protein and corresponds essentially to sequences which interact with UDP-galactosyltransferase and Ca2+ ions.
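The lengths quoted in this abstract are internally consistent, which a short check makes explicit. The variable names below are illustrative only; the numbers are taken directly from the abstract.

```python
# Region lengths in nucleotides, as stated in the abstract.
five_prime_noncoding = 26
coding_region = 426
three_prime_untranslated = 271

# Residue counts, as stated in the abstract.
signal_residues = 19
mature_residues = 123

# The three regions should add up to the reported mRNA length (poly(A) excluded).
total_length = five_prime_noncoding + coding_region + three_prime_untranslated

# Three nucleotides per codon; the coding region should cover both peptides.
codons_in_coding_region = coding_region // 3

assert total_length == 723
assert coding_region % 3 == 0
assert codons_in_coding_region == signal_residues + mature_residues
```

The 426 coding nucleotides correspond to exactly 142 codons, matching the 19 + 123 residues of the pre-protein.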

18.
19.
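The complete genome sequence of Escherichia coli K-12 strain MG1655 was obtained from GenBank, and several parameters related to codon usage bias (Nc, CAI, GC, GC3s) were calculated. The relationships between codon usage bias and both the length of the mRNA coding region and its propensity to form secondary structure were analysed statistically. Although translation efficiency (including translation speed and translation accuracy) is the main factor constraining the codon usage bias of highly expressed E. coli genes, the length of the mRNA coding region and its propensity to form secondary structure are also non-negligible contributors to this bias, and they weaken it to some extent. The biological significance of the propensity of mRNA coding regions to form secondary structure is also discussed.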
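Two of the parameters named above, GC and GC3s, are simple enough to sketch in a few lines of Python. This is an illustrative sketch, not the software used in the study: the function names are hypothetical, the standard genetic code is assumed, and GC3s is taken in its usual sense of G+C frequency at the third position of synonymous codons (that is, excluding Met, Trp and stop codons). Nc and CAI need codon-family and reference-set bookkeeping and are not attempted here.

```python
# Codons with no synonymous alternative (Met, Trp) plus the three stop codons
# of the standard genetic code; these are excluded when computing GC3s.
NON_SYNONYMOUS = {"ATG", "TGG", "TAA", "TAG", "TGA"}


def gc_content(seq):
    """Fraction of G and C bases in a non-empty sequence."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def gc3s(cds):
    """G+C fraction at the third position of synonymous codons in an in-frame CDS."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    all_codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    synonymous = [c for c in all_codons if c not in NON_SYNONYMOUS]
    if not synonymous:
        raise ValueError("sequence contains no synonymous codons")
    return sum(c[2] in "GC" for c in synonymous) / len(synonymous)
```

For the toy sequence ATGGCCAAATAA, only GCC and AAA count toward GC3s, so the value is 0.5, while the overall GC content is 4/12.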

20.
Prediction of splice junctions in mRNA sequences.   Total citations: 8, self-citations: 6, citations by others: 2
K Nakata, M Kanehisa, C DeLisi. Nucleic Acids Research, 1985, 13(14): 5327-5340
A general method based on the statistical technique of discriminant analysis is developed to distinguish boundaries of coding and non-coding regions in nucleic acid sequences. In particular, the method is applied to the prediction of splicing sites in messenger RNA precursors. Information used for discrimination includes consensus sequence patterns around splice junctions, free energy of snRNA and mRNA base pairing, and statistical differences between coding and non-coding regions such as periodic appearance of specific bases in coding regions reflecting the non-random usage of degenerate codons. Given the reading frame of an exon (but not the exon/intron boundaries), the method will predict the following exon, namely, the intron to be excised. When applied to human sequences in the GenBank database, the method correctly identified 80% of true splice junctions.
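One of the signals listed above, the periodic appearance of specific bases in coding regions, can be illustrated with a small Python sketch. This is not the discriminant function from the paper; the function names and the crude score are assumptions made for illustration. It tabulates base frequencies at each of the three frame positions and reports the largest spread for any base.

```python
def frame_base_frequencies(seq):
    """Return base frequencies at frame positions 0, 1 and 2 of a sequence.

    RNA input is accepted by mapping U to T; other symbols are ignored.
    """
    seq = seq.upper().replace("U", "T")
    counts = [{base: 0 for base in "ACGT"} for _ in range(3)]
    totals = [0, 0, 0]
    for i, base in enumerate(seq):
        if base in "ACGT":
            counts[i % 3][base] += 1
            totals[i % 3] += 1
    return [
        {base: (counts[p][base] / totals[p] if totals[p] else 0.0) for base in "ACGT"}
        for p in range(3)
    ]


def periodicity_score(seq):
    """Largest difference, over the four bases, between frame-position frequencies.

    A value near 0 means base usage is the same in all three positions;
    larger values indicate a period-three pattern.
    """
    freqs = frame_base_frequencies(seq)
    return max(
        max(f[base] for f in freqs) - min(f[base] for f in freqs)
        for base in "ACGT"
    )
```

A perfectly periodic toy sequence such as GCAGCAGCA scores 1.0 and a homopolymer scores 0.0. Real coding regions fall in between, and a score like this would be only one input to a discriminant analysis alongside the consensus and free-energy terms the abstract mentions.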
