Similar literature
 20 similar articles found (search time: 15 ms)
1.
Splice junction and possible branch point sequences have been collected from 177 plant introns. Consensus sequences for the 5' and 3' splice junctions and for possible branch points have been derived. The splice junction consensus sequences were virtually identical to those of animal introns except that the polypyrimidine stretch at the 3' splice junction was less pronounced in the plant introns. A search for possible branch points with sequences related to the yeast, vertebrate and fungal consensus sequences revealed a similar sequence in plant introns.
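
As a rough illustration of how a consensus can be derived from a set of collected splice junctions, the sketch below takes pre-aligned, equal-length windows and reports the most frequent base per column. The function name `column_consensus`, the `min_fraction` threshold, and the toy sequences are assumptions made for this example; the abstract does not say how the consensus was actually computed.

```python
from collections import Counter

def column_consensus(aligned, min_fraction=0.5):
    """Return a consensus string for equal-length, pre-aligned sequences.

    A column contributes its most common base if that base reaches
    min_fraction of the sequences (illustrative threshold), else 'N'.
    """
    if not aligned:
        raise ValueError("need at least one sequence")
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("sequences must be pre-aligned to equal length")
    consensus = []
    for column in zip(*aligned):
        base, count = Counter(column).most_common(1)[0]
        consensus.append(base if count / len(aligned) >= min_fraction else "N")
    return "".join(consensus)

# Toy 5' splice junction windows (made-up data, not the 177 plant introns).
toy_sites = ["AGGTAAGT", "AGGTAAGA", "CGGTGAGT", "AGGTATGT"]
print(column_consensus(toy_sites))
```

Raising `min_fraction` makes weakly conserved columns drop out as `N`, which is one simple way to see how pronounced a signal such as the polypyrimidine stretch is.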

2.
Summary: Vitreoscilla hemoglobin is involved in oxygen metabolism of this bacterium, possibly in an unusual role for a microbe. We have isolated the Vitreoscilla hemoglobin structural gene from a pUC19 genomic library using mixed oligodeoxynucleotide probes based on the reported amino acid sequence of the protein. The gene is expressed in Escherichia coli from its natural promoter as a major cellular protein. The nucleotide sequence, which is in complete agreement with the known amino acid sequence of the protein, suggests the existence of promoter and ribosome binding sites with a high degree of homology to consensus E. coli upstream sequences. In the case of at least some amino acids, a codon usage bias can be detected which is different from the biased codon usage pattern in E. coli. The downstream sequence exhibits homology with the 3' end sequences of several plant leghemoglobin genes. E. coli cells expressing the gene contain greater than fivefold more heme than controls.

3.
N Tolstrup, P Rouzé, S Brunak. Nucleic Acids Research, 1997, 25(15): 3159-3163
Little knowledge exists about branch points in plants; it has even been claimed that plant introns lack conserved branch point sequences similar to those found in vertebrate introns. A putative branch point consensus sequence for Arabidopsis thaliana resembling the well known metazoan consensus sequence has been proposed, but this is based on a search for sequences similar to those in yeast and metazoa. Here we present a novel consensus sequence found by a non-circular approach. A hidden Markov model with a fixed A nucleotide was trained on sequences upstream of the acceptor site. The consensus found by the Markov model shares features with the metazoan consensus, but differs in its details from the consensus proposed earlier. Despite the fact that branch point consensus sequences in plants are weak, we show that a prediction scheme incorporating them leads to a substantial improvement in the recognition of true acceptor sites, the false positive rate being reduced by a factor of 2. We take this as an indication that the consensus found here is the genuine one and that the branch point does play a role in the proper recognition of the acceptor site in plants.

4.
Context sequences of translation initiation codon in plants (total citations: 17; self-citations: 0; citations by others: 17)
In this survey of 5074 plant genes for their AUG context sequences, purines are present at the -3 and +4 positions in about 80% of the sequences. Although this observation is similar to the vertebrate consensus sequence, the number of plant mRNAs with purines at the -3 position is lower and at the +4 position is higher than reported for vertebrate mRNAs. Higher plants have an AC-rich consensus sequence, caA(A/C)aAUGGCg, as the context of the translation initiator codon. Between the two major groups of angiosperms, the context of the AUG codon in dicot mRNAs is aaA(A/C)aAUGGCu, which is similar to the higher-plant consensus, but monocot mRNAs have c(a/c)(A/G)(A/C)cAUGGCG as a consensus, which exhibits an overall similarity with the vertebrate consensus. The experimental evidence regarding the importance of the AUG context in plants is discussed.
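
The -3 and +4 positions can be made concrete with a small sketch. It assumes the usual numbering in which the A of AUG is +1 and there is no position 0; the function name `aug_context_flags` and the example strings are hypothetical and only loosely modelled on the consensus quoted above.

```python
PURINES = {"A", "G"}

def aug_context_flags(mrna, aug_index):
    """Report whether positions -3 and +4 around an AUG are purines.

    Numbering assumption: the A of AUG is +1 and there is no position 0,
    so -3 is three bases upstream of the A and +4 is the base directly
    after the G of AUG.
    """
    if mrna[aug_index:aug_index + 3] != "AUG":
        raise ValueError("aug_index does not point at an AUG")
    if aug_index < 3 or aug_index + 3 >= len(mrna):
        raise ValueError("not enough flanking sequence")
    return mrna[aug_index - 3] in PURINES, mrna[aug_index + 3] in PURINES

# One concrete instance of caA(A/C)aAUGGCg, written in upper case.
example = "CAAAAAUGGCG"
print(aug_context_flags(example, example.find("AUG")))
```

Counting how often each flag is true over a gene set would give the kind of percentages reported in the survey.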

5.
6.
Summary: The nucleotide sequences of viroids contain features believed to be essential for the splicing of group I introns. Common sequence elements include a 16-nucleotide consensus sequence and three pairs of short sequences arranged in the same sequential order in both types of RNAs. The calculated probability of finding sequences resembling the 16-nucleotide consensus sequence in random nucleotide chains showed that at low fidelity (up to 5 mismatched nucleotides), the number of such sequences in viroids, plant viral satellite RNAs, plant viral RNAs and one plant viral DNA, group I introns and flanking exons does not significantly differ from the number expected at random. As the degree of fidelity is increased, the number in both introns and viroids, but not in exons or the other plant pathogens examined, greatly exceeds that expected in random chains. These findings suggest that viroids may have evolved from group I introns and/or that processing of viroid oligomers to monomers may have structural requirements similar to those of group I introns. The nucleotide sequences of viroids do not show close homology with two conserved regions of group II introns, the 14-base pair consensus region and the 5' terminal segment. However, close homology does exist between the conserved sequence of the 3' terminal segment of group II introns and viroids, thus suggesting a possible evolutionary or functional relationship.
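
The comparison against random chains can be sketched with a binomial model. The code below assumes independent, equiprobable bases and a fully specified 16-mer, which is a simplification: the abstract does not state the exact model used, and a consensus with degenerate positions would match more often. The function names are hypothetical.

```python
from math import comb

def p_match_within(n, max_mismatches, p_base=0.25):
    """Probability that a random n-mer matches a fixed n-mer with at most
    max_mismatches mismatches, assuming independent bases that each match
    with probability p_base."""
    return sum(
        comb(n, k) * (1 - p_base) ** k * p_base ** (n - k)
        for k in range(max_mismatches + 1)
    )

def expected_hits(chain_length, n, max_mismatches, p_base=0.25):
    """Expected number of windows in a random chain that pass the test."""
    windows = max(chain_length - n + 1, 0)
    return windows * p_match_within(n, max_mismatches, p_base)

# Expected hits per 1000 nt as the allowed mismatches shrink (higher fidelity).
for mismatches in (5, 3, 1):
    print(mismatches, expected_hits(1000, 16, mismatches))
```

Under these assumptions the expected count falls steeply as fewer mismatches are allowed, so an excess over expectation only becomes informative at higher fidelity.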

7.
The globin derived from the monomer Component IV hemoglobin of the marine annelid, Glycera dibranchiata, has been completely sequenced, and the resulting information has been used to create a structural model of the protein. The most important result is that the consensus sequence of Component IV differs by 3 amino acids from a cDNA-predicted amino acid sequence thought earlier to encode the Component IV hemoglobin. This work reveals that the histidine (E7), typical of most heme-containing globins, is replaced by leucine in Component IV. Also significant is that this sequence is not identical to any of the previously reported Glycera dibranchiata monomer hemoglobin sequences, including the sequence from a previously reported crystal structure, but has high identity to all. A three-dimensional structural model for monomer Component IV hemoglobin was constructed using the published 1.5 Å crystal structure of a monomer hemoglobin from Glycera dibranchiata as a template. The model shows several interesting features: (1) a Phe31 (B10) that is positioned in the active site; (2) a His39 that occurs in an interhelical region occupied by Pro in 98.2% of reported globin sequences; and (3) a Met41 that is found at a position that emerges from this work as a previously unrecognized heme contact.

8.
We aligned 14 5'-leading sequences of small subunit ribulose-1,5-bisphosphate carboxylase (rbcS) genes. A strong consensus sequence ("CCTTATCAT") was located directly upstream of the TATA-box. The occurrence of this motif in other light dependent phytochrome regulated plant genes led to the calculation of two consensus matrices. With these two matrices we are able to distinguish almost all known light induced plant genes which are phytochrome regulated from non-light induced plant genes, indicating that all these genes share a common light-responsive element (LRE). The results obtained by computer analysis are discussed with regard to experimental data.
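
A consensus matrix of this kind is commonly represented as a position weight matrix. The sketch below builds log-odds scores from aligned sites and scans a sequence for the best window. The pseudocount, the uniform background, the function names, and the toy sites are all assumptions; the abstract does not describe how its two matrices were calculated.

```python
import math

def build_pwm(sites, alphabet="ACGT", pseudocount=1.0, background=0.25):
    """Build a log2-odds position weight matrix from equal-length sites."""
    pwm = []
    for pos in range(len(sites[0])):
        column = [s[pos] for s in sites]
        total = len(column) + pseudocount * len(alphabet)
        pwm.append({
            b: math.log2(((column.count(b) + pseudocount) / total) / background)
            for b in alphabet
        })
    return pwm

def score(pwm, window):
    """Sum the per-position log-odds for one window of matching width."""
    return sum(col[b] for col, b in zip(pwm, window))

def best_hit(pwm, seq):
    """Return (score, start) of the highest-scoring window in seq."""
    width = len(pwm)
    return max((score(pwm, seq[i:i + width]), i)
               for i in range(len(seq) - width + 1))

# Toy sites built around the reported CCTTATCAT motif (illustrative only).
toy_sites = ["CCTTATCAT", "CCTTATCAT", "CCTTACCAT", "CATTATCAT"]
pwm = build_pwm(toy_sites)
print(best_hit(pwm, "GGGCCTTATCATGG"))
```

Classifying a gene as light induced would then amount to comparing its best score against a cutoff, which would have to be chosen from real data.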

9.
The globin derived from the monomer Component IV hemoglobin of the marine annelid, Glycera dibranchiata, has been completely sequenced, and the resulting information has been used to create a structural model of the protein. The most important result is that the consensus sequence of Component IV differs by 3 amino acids from a cDNA-predicted amino acid sequence thought earlier to encode the Component IV hemoglobin. This work reveals that the histidine (E7), typical of most heme-containing globins, is replaced by leucine in Component IV. Also significant is that this sequence is not identical to any of the previously reported Glycera dibranchiata monomer hemoglobin sequences, including the sequence from a previously reported crystal structure, but has high identity to all. A three-dimensional structural model for monomer Component IV hemoglobin was constructed using the published 1.5 Å crystal structure of a monomer hemoglobin from Glycera dibranchiata as a template. The model shows several interesting features: (1) a Phe31 (B10) that is positioned in the active site; (2) a His39 that occurs in an interhelical region occupied by Pro in 98.2% of reported globin sequences; and (3) a Met41 that is found at a position that emerges from this work as a previously unrecognized heme contact.
Abbreviations used: GMHX, the holo-protein (including b-type heme), Glycera dibranchiata monomer hemoglobin Component X (X = 2, 3, or 4); GMGX, the apo-protein, or globin, Glycera dibranchiata monomer globin derived from Component X (X = 2, 3, or 4); rec-gmg, the globin derived from a recombinant holoprotein of a Glycera dibranchiata monomer hemoglobin, rec-gmh, whose sequence has been inferred from an isolated cDNA insert; CB, label refers to peptides generated from cyanogen bromide cleavage of GMG4; HPLC, high-performance liquid chromatography; T, label refers to peptides generated from trypsin digests of GMG4; Mb, myoglobin; MCS, monomer hemoglobin crystal structure from Glycera dibranchiata; H, N-terminal sequence of GMG4; SWMb, sperm whale myoglobin.

10.
11.
MOTIVATION: A consensus sequence for a family of related sequences is, as the name suggests, a sequence that captures the features common to most members of the family. Consensus sequences are important in various DNA sequencing applications and are a convenient way to characterize a family of molecules. RESULTS: This paper describes a new algorithm for finding a consensus sequence, using the popular optimization method known as simulated annealing. Unlike the conventional approach of finding a consensus sequence by first forming a multiple sequence alignment, this algorithm searches for a sequence that minimises the sum of pairwise distances to each of the input sequences. The resulting consensus sequence can then be used to induce a multiple sequence alignment. The time required by the algorithm scales linearly with the number of input sequences and quadratically with the length of the consensus sequence. We present results demonstrating the high quality of the consensus sequences and alignments produced by the new algorithm. For comparison, we also present similar results obtained using ClustalW. The new algorithm outperforms ClustalW in many cases.
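
The idea of searching for a sequence that minimises the summed distance to all inputs can be sketched as follows. This is not the paper's algorithm: the distance measure (plain Levenshtein distance), the move set, the geometric cooling schedule, and every parameter value are assumptions chosen to keep the example short.

```python
import math
import random

def edit_distance(a, b):
    """Levenshtein distance via the standard two-row dynamic programme."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]

def total_distance(candidate, seqs):
    """Objective: sum of distances from the candidate to every input."""
    return sum(edit_distance(candidate, s) for s in seqs)

def mutate(seq, rng, alphabet="ACGT"):
    """Apply one random substitution, insertion, or deletion."""
    moves = ["sub", "ins"] + (["del"] if len(seq) > 1 else [])
    move = rng.choice(moves)
    if move == "ins":
        pos = rng.randrange(len(seq) + 1)
        return seq[:pos] + rng.choice(alphabet) + seq[pos:]
    pos = rng.randrange(len(seq))
    if move == "sub":
        return seq[:pos] + rng.choice(alphabet) + seq[pos + 1:]
    return seq[:pos] + seq[pos + 1:]

def anneal_consensus(seqs, steps=2000, t_start=2.0, t_end=0.01, seed=0):
    """Simulated annealing over candidate consensus strings."""
    if not seqs or not seqs[0]:
        raise ValueError("need at least one non-empty sequence")
    rng = random.Random(seed)
    current = seqs[0]
    current_cost = total_distance(current, seqs)
    best, best_cost = current, current_cost
    for step in range(steps):
        # Geometric cooling from t_start towards t_end.
        temp = t_start * (t_end / t_start) ** (step / steps)
        candidate = mutate(current, rng)
        cost = total_distance(candidate, seqs)
        # Always accept improvements; accept worse moves with Boltzmann odds.
        if cost <= current_cost or rng.random() < math.exp((current_cost - cost) / temp):
            current, current_cost = candidate, cost
            if cost < best_cost:
                best, best_cost = candidate, cost
    return best, best_cost

toy_family = ["ACGTACGT", "ACGAACGT", "ACGTACGA", "ACTTACGT"]
print(anneal_consensus(toy_family, steps=500))
```

Each evaluation of the objective runs one quadratic-time comparison per input sequence, which is consistent with the scaling stated in the abstract, although the paper's own distance computation may differ.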

12.
The complete amino acid sequence of gladiolus bulb chitinase-a (GBC-a) was determined. First the tryptic peptides from GBC-a after it was reduced and S-carboxymethylated were sequenced and then the peptides were further studied by chemical cleavage of the enzyme. GBC-a consisted of 274 amino acid residues and had a molecular mass of 30,714 Da. Two consensus sequences essential for chitinase activity by plant class III chitinases were conserved in GBC-a, although its sequence similarity with plant class III chitinases was less than 20%. Sequence comparison of GBC-a with sequences of other proteins in a protein identification resource (PIR) showed that the GBC-a sequence was 33% similar to that of narbonin, a seed storage 2S globulin from narbon beans.

13.
S M Halling, N Kleckner. Cell, 1982, 28(1): 155-163
Transposon Tn10 inserts at many sites in the bacterial chromosome, but preferentially inserts at particular hotspots. We believe we have identified the target DNA signal responsible for this specificity. We have determined the DNA sequences of 11 Tn10 insertion sites and identified a particular 6 base pair (bp) symmetrical consensus sequence (GCTNAGC) common to those sites. The sequences at some sites differ from the consensus sequence, but only in limited and well defined ways. The sequences at some sites differ more from the consensus sequence than do sequences at other sites, and the consensus sequence and closely related sequences are generally absent from potential target regions where Tn10 is known not to insert. Other aspects of the target DNA can significantly influence the efficiency with which a particular target site sequence is used. The 6 bp consensus sequence is symmetrically located within the 9 bp target DNA sequence that is cleaved and duplicated during Tn10 insertion. This juxtaposition of recognition and cleavage sites plus the symmetry of the perfect consensus sequence suggest that the target DNA may be both recognized and cleaved by the symmetrically disposed subunits of a single protein, as suggested for type II restriction endonucleases. There is plausible homology between the consensus sequence and the very ends of Tn10, compatible with recognition of transposon ends and target DNA by the same protein. The sequences of actual insertion sites deviate from the perfect consensus sequence in a way which suggests that the 6 bp specificity determinant may be recognized through protein-DNA contacts along the major groove of the DNA double helix.
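
The symmetry of GCTNAGC can be checked directly: a site is symmetrical in this sense when it equals its own reverse complement. The helper names below are hypothetical, and treating N as its own complement is an assumption made for the unspecified central position.

```python
# N stands for the unspecified central base and is treated as self-complementary.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G", "N": "N"}

def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[b] for b in reversed(dna))

def is_symmetric(site):
    """True if the site reads the same on both strands."""
    return site == reverse_complement(site)

print(is_symmetric("GCTNAGC"))
```

The six specified positions pair up around the central N, which is why the consensus is described as 6 bp even though it is written with seven letters.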

14.
The branchpoint sequence and associated polypyrimidine tract are firmly established splicing signals in vertebrates. In plants, however, these signals have not been characterized in detail. The potato invertase mini-exon 2 (9 nt) requires a branchpoint sequence positioned around 50 nt upstream of the 5' splice site of the neighboring intron and a U11 element found adjacent to the branchpoint in the upstream intron (Simpson et al., RNA, 2000, 6:422-433). Utilizing the sensitivity of this plant splicing system, these elements have been characterized by systematic mutation and analysis of the effect on inclusion of the mini-exon. Mutation of the branchpoint sequence in all possible positions demonstrated that branchpoints matching the consensus, CURAY, were most efficient at supporting splicing. Branchpoint sequences that differed from this consensus were still able to permit mini-exon inclusion but at greatly reduced levels. Mutation of the downstream U11 element suggested that it functioned as a polypyrimidine tract rather than a UA-rich element, common to plant introns. The minimum sequence requirement of the polypyrimidine tract for efficient splicing was two closely positioned groups of uridines 3-4 nt long (<6 nt apart) that, within the context of the mini-exon system, required being close (<14 nt) to the branchpoint sequence. The functional characterization of the branchpoint sequence and polypyrimidine tract defines these sequences in plants for the first time, and firmly establishes polypyrimidine tracts as important signals in splicing of at least some plant introns.
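
A degenerate consensus such as CURAY can be scanned for by expanding the standard IUPAC codes (R = A or G, Y = C or U, N = any base) into a regular expression. The function names and the test sequence are made up for illustration; only the motif comes from the abstract.

```python
import re

# Standard IUPAC nucleotide codes needed here, in the RNA alphabet.
IUPAC_RNA = {"A": "A", "C": "C", "G": "G", "U": "U",
             "R": "[AG]", "Y": "[CU]", "N": "[ACGU]"}

def motif_to_regex(motif):
    """Compile a degenerate motif; the lookahead allows overlapping hits."""
    return re.compile("(?=(" + "".join(IUPAC_RNA[c] for c in motif) + "))")

def find_motif(rna, motif="CURAY"):
    """Return 0-based start positions of every match of the motif."""
    return [m.start() for m in motif_to_regex(motif).finditer(rna)]

toy_intron = "GGCUAACUUCUGAUGG"  # invented sequence
print(find_motif(toy_intron))
```

The putative branchpoint adenosine is the fourth base of each hit, so its position is the reported start plus 3.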

15.
16.
Intron lariat formation between the 5' end of an intron and a branchpoint adenosine is a fundamental aspect of the first step in animal and yeast nuclear pre-mRNA splicing. Despite similarities in intron sequence requirements and the components of splicing, differences exist between the splicing of plant and vertebrate introns. The identification of AU-rich sequences as major functional elements in plant introns and the demonstration that a branchpoint consensus sequence was not required for splicing have led to the suggestion that the transition from AU-rich intron to GC-rich exon is a major potential signal by which plant pre-mRNA splice sites are recognized. The role of putative branchpoint sequences as an internal signal in plant intron recognition/definition has been re-examined. Single nucleotide mutations in putative branchpoint adenosines contained within CUNAN sequences in four different plant introns all significantly reduced splicing efficiency. These results provide the most direct evidence to date for preferred branchpoint sequences being required for the efficient splicing of at least some plant introns in addition to the important role played by AU sequences in dicot intron recognition. The observed patterns of 3' splice site selection in the introns studied are consistent with the scanning model described for animal intron 3' splice site selection. It is suggested that, despite the clear importance of AU sequences for plant intron splicing, the fundamental processes of splice site selection and splicing in plants are similar to those in animals.

17.
18.
19.
The heat shock protein 70 kDa sequences (HSP70) are of great importance as molecular chaperones in protein folding and transport. They are abundant under conditions of cellular stress. They are highly conserved in all domains of life: Archaea, eubacteria, eukaryotes, and organelles (mitochondria, chloroplasts). A multiple alignment of a large collection of these sequences was obtained employing our symmetric-iterative ITERALIGN program (Brocchieri and Karlin 1998). Assessments of conservation are interpreted in evolutionary terms and with respect to functional implications. Many archaeal sequences (methanogens and halophiles) tend to align best with the Gram-positive sequences. These two groups also miss a signature segment [about 25 amino acids (aa) long] present in all other HSP70 species (Gupta and Golding 1993). We observed a second signature sequence of about 4 aa absent from all eukaryotic homologues, significantly aligned in all prokaryotic sequences. Consensus sequences were developed for eight groups [Archaea, Gram-positive, proteobacterial Gram-negative, singular bacteria, mitochondria, plastids, eukaryotic endoplasmic reticulum (ER) isoforms, eukaryotic cytoplasmic isoforms]. All group consensus comparisons tend to summarize the alignments better than do the individual sequence comparisons. The global individual consensus "matches" 87% with the consensus of consensuses sequence. A functional analysis of the global consensus identifies a (new) highly significant mixed charge cluster proximal to the carboxyl terminus of the sequence highlighting the hypercharge run EEDKKRRER (one-letter aa code used). The individual Archaea and Gram-positive sequences contain a corresponding significant mixed charge cluster in the location of the charge cluster of the consensus sequence. In contrast, the four Gram-negative proteobacterial sequences of the alignment do not have a charge cluster (even at the 5% significance level). All eukaryotic HSP70 sequences have the analogous charge cluster. Strikingly, several of the eukaryotic isoforms show multiple mixed charged clusters. These clusters were interpreted with supporting data related to HSP70 activity in facilitating chaperone, transport, and secretion function. We observed that the consensus contains only a single tryptophan residue and a single conserved cysteine. This is interpreted with respect to the target rule for disaggregating misfolded proteins. The mitochondrial HSP70 connections to bacterial HSP70 are analyzed, suggesting a polyphyletic split of Trypanosoma and Leishmania protist mitochondrial (Mt) homologues separated from Mt-animal/fungal/plant homologues. Moreover, the HSP70 sequences from the amitochondrial Entamoeba histolytica and Trichomonas vaginalis species were analyzed. The E. histolytica HSP70 is most similar to the higher eukaryotic cytoplasmic sequences, with significantly weaker alignments to ER sequences and much diminished matching to all eubacterial, mitochondrial, and chloroplast sequences. This appears to be at variance with the hypothesis that E. histolytica rather recently lost its mitochondrial organelle. T. vaginalis contains two HSP70 sequences, one Mt-like and the second similar to eukaryotic cytoplasmic sequences, suggesting two diverse origins. Received: 29 January 1998 / Accepted: 14 May 1998

20.
Plant genomics projects involving model species and many agriculturally important crops are resulting in a rapidly increasing database of genomic and expressed DNA sequences. The publicly available collection of expressed sequence tags (ESTs) from several grass species can be used in the analysis of both structural and functional relationships in these genomes. We analyzed over 260000 EST sequences from five different cereals for their potential use in developing simple sequence repeat (SSR) markers. The frequency of SSR-containing ESTs (SSR-ESTs) in this collection varied from 1.5% for maize to 4.7% for rice. In addition, we identified several ESTs that are related to the SSR-ESTs by BLAST analysis. The SSR-ESTs and the related sequences were clustered within each species in order to reduce the redundancy and to produce a longer consensus sequence. The consensus and singleton sequences from each species were pooled and clustered to identify cross-species matches. Overall a reduction in the redundancy by 85% was observed when the resulting consensus and singleton sequences (3569) were compared to the total number of SSR-EST and related sequences analyzed (24606). This information can be useful for the development of SSR markers that can amplify across the grass genera for comparative mapping and genetics. Functional analysis may reveal their role in plant metabolism and gene evolution.
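
Screening ESTs for SSRs is often done with a back-referencing regular expression. The sketch below reports tandem repeats of a short unit; the unit lengths (2 to 4 nt), the minimum of 5 repeats, and the function name `find_ssrs` are illustrative assumptions, since the abstract does not give the criteria that were used.

```python
import re

def find_ssrs(seq, min_unit=2, max_unit=4, min_repeats=5):
    """Return (start, unit, repeat_count) for each tandem repeat found.

    The unit is matched lazily so the shortest repeating unit is preferred;
    a homopolymer run would therefore be reported with a 2-nt unit.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq.upper()):
        unit = m.group(1)
        hits.append((m.start(), unit, len(m.group(0)) // len(unit)))
    return hits

toy_est = "TTGAGAGAGAGAGATTC"  # invented sequence with a (GA)6 repeat
print(find_ssrs(toy_est))
```

The fraction of ESTs returning a non-empty list would correspond to the SSR-EST frequency reported per species, given matching thresholds.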
