Similar documents
 Found 20 similar documents (search time: 31 ms)
1.
2.
Signals necessary for in vivo expression of Ti plasmid T-DNA-encoded octopine and nopaline synthase genes were studied in crown gall tumors by constructing mutated genes carrying various lengths of sequences upstream of the 5' initiation site of their mRNAs. Deletions upstream of position -294 did not interfere with expression of the octopine synthase gene, while those extending upstream of position -170 greatly reduced gene expression. The estimated size of the octopine synthase promoter is therefore 295 bp. The maximal length of 5' upstream sequences involved in the in vivo expression of the nopaline synthase gene is 261 bp. Our results also demonstrated that Ti plasmid-derived sequences contain all signals essential for expression of opine synthase genes in plants. Expression of these genes, therefore, is independent of the direct vicinity of the plant DNA sequences and is not activated by the formation of a junction between plant DNA and the T-DNA border.

3.
The identification of potential protein binding sites (cis-regulatory elements) in the upstream regions of genes is key to understanding the mechanisms that regulate gene expression. To this end, we present a simple, efficient algorithm, BEAM (beam-search enumerative algorithm for motif finding), aimed at the discovery of cis-regulatory elements in the DNA sequences upstream of a related group of genes. This algorithm dramatically limits the search space of expanded sequences, converting the problem from one that is exponential in the length of motifs sought to one that is linear. Unlike sampling algorithms, our algorithm converges and is capable of finding statistically overrepresented motifs with a low failure rate. Further, our algorithm is not dependent on the objective function or the organism used. Limiting the space of candidate motifs enables the algorithm to focus only on those motifs that are most likely to be biologically relevant and enables the algorithm to use direct evaluations of background frequencies instead of resorting to probabilistic estimates. In addition, limiting the space of candidate motifs makes it possible to use computationally expensive objective functions that are able to correctly identify biologically relevant motifs.
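
The idea of limiting the search space with a beam can be sketched as follows. This is a minimal illustration, not the published BEAM implementation: the function names, the rightward-only extension, and the scoring rule (number of sequences containing the candidate) are assumptions made for the example. A real objective function would measure statistical overrepresentation against background frequencies.

```python
def count_sequences_with(motif, sequences):
    """Stand-in score (an assumption): how many sequences contain the motif."""
    return sum(motif in seq for seq in sequences)


def beam_search_motifs(sequences, max_len=6, beam_width=5):
    """Grow candidate motifs one base at a time, keeping only the best
    `beam_width` candidates after each extension step.

    At most 4 * beam_width candidates are scored per step, so the number of
    evaluations grows linearly with the motif length instead of as 4**max_len.
    """
    beam = [""]
    for _ in range(max_len):
        candidates = {motif + base for motif in beam for base in "ACGT"}
        # Sort by descending score; ties are broken alphabetically so the
        # result is deterministic.
        ranked = sorted(
            candidates,
            key=lambda m: (-count_sequences_with(m, sequences), m),
        )
        beam = ranked[:beam_width]
    return [(m, count_sequences_with(m, sequences)) for m in beam]


# Hypothetical upstream sequences sharing the planted hexamer TGACTC.
upstream = ["AATGACTCGG", "CCTGACTCAA", "GTTGACTCTT", "TGACTCACAC"]
best_motif, support = beam_search_motifs(upstream)[0]
print(best_motif, support)
```

Because the beam discards candidates at every step, the search is heuristic: with a beam that is too narrow, a prefix of the true motif can be dropped on a tie and never recovered.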

4.
INCLUSive allows automatic multistep analysis of microarray data (clustering and motif finding). The clustering algorithm (adaptive quality-based clustering) groups together genes with highly similar expression profiles. The upstream sequences of the genes belonging to a cluster are automatically retrieved from GenBank and can be fed directly into Motif Sampler, a Gibbs sampling algorithm that retrieves statistically over-represented motifs in sets of sequences, in this case upstream regions of co-expressed genes.

5.
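Analysis of potential positive transcriptional regulatory sites in the upstream sequences of yeast genes   Total citations: 3, self-citations: 0, citations by others: 3
Previous studies showed that the introns of highly transcribed yeast genes differ from those of weakly transcribed genes in sequence length, oligonucleotide usage, and positional distribution. Further observation shows that the length of the upstream intergenic region is related to transcription frequency in the same way as intron length: the upstream intergenic sequences of highly transcribed genes are generally longer than those of weakly transcribed genes. A statistical comparison of oligonucleotide usage frequencies between the upstream intergenic sequences of highly and weakly transcribed genes was used to extract candidate positive transcriptional regulatory elements from the upstream regions of highly transcribed genes. Compared with all non-coding sequences in yeast, these candidate positive regulatory elements are also largely over-represented, and most of them agree with features of experimentally determined sites. The elements are rich in G and C, which is to some extent complementary in base composition to the candidate positive regulatory elements found in introns. These features suggest that the sequence structure upstream of highly transcribed genes does favor transcription.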
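
A comparison of oligonucleotide usage between two sequence sets can be sketched as below. This is only an illustration of the general approach: the abstract does not give the statistic that was used, so the smoothed log2 frequency ratio, the function names, and the toy sequences are all assumptions.

```python
from collections import Counter
from math import log2


def kmer_counts(sequences, k):
    """Count every overlapping k-mer across a set of sequences."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def enriched_kmers(high, low, k):
    """Rank k-mers by log2 of their frequency in `high` relative to `low`.

    Add-one smoothing (an assumed choice) keeps k-mers that are absent from
    one set from producing a division by zero.
    """
    high_counts, low_counts = kmer_counts(high, k), kmer_counts(low, k)
    vocabulary = set(high_counts) | set(low_counts)
    n_high = sum(high_counts.values()) + len(vocabulary)
    n_low = sum(low_counts.values()) + len(vocabulary)
    scores = {
        kmer: log2(((high_counts[kmer] + 1) / n_high)
                   / ((low_counts[kmer] + 1) / n_low))
        for kmer in vocabulary
    }
    # Highest ratio first; ties are broken alphabetically.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical upstream regions of highly and weakly transcribed genes.
high_set = ["GCGCGC", "GCGCAT"]
low_set = ["ATATAT", "ATATGC"]
for kmer, score in enriched_kmers(high_set, low_set, k=2)[:3]:
    print(kmer, round(score, 2))
```

In this toy input the G+C-rich dinucleotides rank first, which mirrors the direction of the reported result but is not evidence for it.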

6.
Alternative oxidase (Aox) is a nuclear-encoded mitochondrial protein. In soybean (Glycine max), the three members of the gene family have been shown to be differentially expressed during normal plant development and in response to stresses. To examine the function of the Aox promoters, genomic fragments were obtained for all three soybean genes: Aox1, Aox2a, and Aox2b. The regions of these fragments immediately upstream of the coding regions were used to drive beta-glucuronidase (GUS) expression during transient transformation of soybean suspension culture cells and stable transformation of Arabidopsis. The expression patterns of the GUS reporter genes in soybean cells were in agreement with the presence or absence of the various endogenous Aox proteins, determined by immunoblotting. Deletion of different portions of the upstream regions identified sequences responsible for both positive and negative regulation of Aox gene expression in soybean cells. Reporter gene analysis in Arabidopsis plants showed differential tissue expression patterns driven by the three upstream regions, similar to those reported for the endogenous proteins in soybean. The expression profiles of all five members of the Arabidopsis Aox gene family were examined also, to compare with GUS expression driven by the soybean upstream fragments. Even though the promoter activity of the upstream fragments from soybean Aox2a and Aox2b displayed the same tissue specificity in Arabidopsis as they do in soybean, the most prominently expressed endogenous genes in all tissues of Arabidopsis were of the Aox1 type. Thus although regulation of Aox expression generally appears to involve the same signals in different species, different orthologs of Aox may respond variously to these signals. A comparison of upstream sequences between soybean Aox genes and similarly expressed Arabidopsis Aox genes identified common motifs.

7.
We present an efficient algorithm for detecting putative regulatory elements in the upstream DNA sequences of genes, using gene expression information obtained from microarray experiments. Based on a generalized suffix tree, our algorithm looks for motif patterns whose appearance in the upstream region is most correlated with the expression levels of the genes. We are able to find the optimal pattern, in time linear in the total length of the upstream sequences. We implement and apply our algorithm to publicly available microarray gene expression data, and show that our method is able to discover biologically significant motifs, including various motifs which have been reported previously using the same data set. We further discuss applications for which the efficiency of the method is essential, as well as possible extensions to our algorithm.
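
The scoring idea (motif presence correlated with expression level) can be illustrated with a brute-force sketch. Note the swap: this example enumerates fixed-length k-mers directly and does not use the generalized suffix tree that gives the paper its linear running time. The Pearson correlation, the function names, and the data are assumptions chosen for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def best_correlated_kmer(upstream, expression, k):
    """Return the k-mer whose presence/absence across the upstream regions
    correlates most strongly (positively) with the expression values."""
    kmers = {seq[i:i + k] for seq in upstream for i in range(len(seq) - k + 1)}
    best = None
    for kmer in sorted(kmers):  # sorted for a deterministic tie-break
        presence = [1.0 if kmer in seq else 0.0 for seq in upstream]
        r = pearson(presence, expression)
        if best is None or r > best[1]:
            best = (kmer, r)
    return best


# Hypothetical data: one upstream region and one expression value per gene.
regions = ["ACGTGA", "TTCGTG", "AAATTT", "CCCAAA"]
levels = [5.0, 5.0, 1.0, 1.0]
print(best_correlated_kmer(regions, levels, k=4))
```

The brute-force version rescans every sequence for every k-mer, which is the cost a suffix-tree traversal is designed to avoid.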

8.
9.
Structure of the murine serum amyloid A gene family. Gene conversion   Total citations: 19, self-citations: 0, citations by others: 19
Serum amyloid A (SAA) is an apolipoprotein produced by the liver in response to inflammation; the levels of SAA mRNA and SAA protein increase at least 500-fold within 24 h. We have obtained clones of all three genes and the pseudogene that make up the murine SAA gene family. Two of the genes have 96% sequence homology over their entire length, including introns and flanking sequences 288 base pairs (bp) 5' and 443 bp 3' to the genes: an overall length of 3215 bp. The sharp boundaries between homologous and nonhomologous sequences and the absence of interspersed repeated sequences there suggest that conversion has occurred between these two genes. The homologous regions are bounded by short inverted repeats containing alternating purine and pyrimidine residues, as described for other gene conversion units. The third SAA gene has evolved separately, although all are closely linked on chromosome 7. Comparison of the upstream regions of the SAA genes with those of the rat fibrinogen genes, whose expression is also induced by inflammation, reveals sequences common to all six genes which are very improbable on a random basis.

10.
11.
12.
13.
The nucleotide sequences of the entire gene family, comprising six genes, that encodes the Rubisco small subunit (rbcS) multigene family in Mesembryanthemum crystallinum (common ice plant), were determined. Five of the genes are arranged in a tandem array spanning 20 kb, while the sixth gene is not closely linked to this array. The mature small subunit coding regions are highly conserved and encode four distinct polypeptides of equal lengths with up to five amino acid differences distinguishing individual genes. The transit peptide coding regions are more divergent in both amino acid sequence and length, encoding five distinct peptide sequences that range from 55 to 61 amino acids in length. Each of the genes has two introns located at conserved sites within the mature peptide-coding regions. The first introns are diverse in sequence and length, ranging from 122 bp to 1092 bp. Five of the six second introns are highly conserved in sequence and length. Two genes, rbcS-4 and rbcS-5, are identical at the nucleotide level starting from 121 bp upstream of the ATG initiation codon to 9 bp downstream of the stop codon including the sequences of both introns, indicating recent gene duplication and/or gene conversion. Functionally important regulatory elements identified in rbcS promoters of other species are absent from the upstream regions of all but one of the ice plant rbcS genes. Relative expression levels were determined for the rbcS genes and indicate that they are differentially expressed in leaves.

14.
15.

16.
Highly homologous DNA elements were found to be shared by the upstream regions of the mouse tyrosinase and tyrosinase related protein (TRP-1) genes. Several nuclear proteins were shown to bind to both of these upstream regions. Shared homologous DNA elements were also found in the 5' flanking sequences of Japanese quail and snapping turtle tyrosinase genes. Shared homologous nucleotide sequences were found to be scattered like an archipelago in the 5' upstream regions of mouse and human tyrosinase genes. Comparisons between Japanese quail and snapping turtle tyrosinase genes gave similar results. However, mammalian (mouse and human) and nonmammalian (quail and snapping turtle) tyrosinase genes did not show significant homology in their 5' upstream regions. In contrast, coding sequences in the first exons of vertebrate tyrosinase genes and their deduced amino acid sequences were found to be highly conserved except for their putative leader sequence-coding regions.

17.
This paper introduces two exact algorithms for extracting conserved structured motifs from a set of DNA sequences. Structured motifs may be described as an ordered collection of p ≥ 1 "boxes" (each box corresponding to one part of the structured motif), p substitution rates (one for each box) and p - 1 intervals of distance (one for each pair of successive boxes in the collection). The contents of the boxes (that is, the motifs themselves) are unknown at the start of the algorithm. This is precisely what the algorithms are meant to find. A suffix tree is used for finding such motifs. The algorithms are efficient enough to be able to infer site consensi, such as, for instance, promoter sequences or regulatory sites, from a set of unaligned sequences corresponding to the noncoding regions upstream from all genes of a genome. In particular, the time complexity of both algorithms scales linearly with N^2 n, where n is the average length of the sequences and N their number. An application to the identification of promoter and regulatory consensus sequences in bacterial genomes is shown.
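
The structured-motif model itself (boxes, a substitution limit per box, and a distance interval between successive boxes) can be made concrete with a small matcher. This sketch only checks where a known structured motif occurs. It does not infer unknown box contents and does not use a suffix tree, so it is not the extraction algorithm described above. The function names and the test sequence are assumptions; the two boxes are the familiar bacterial -35 and -10 promoter consensus hexamers.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_structured_motif(seq, boxes, max_subs, gaps):
    """Find occurrences of a structured motif in `seq`.

    boxes    -- list of p box strings
    max_subs -- list of p substitution limits, one per box
    gaps     -- list of p - 1 (min, max) distances between successive boxes
    Returns a list of tuples holding the start position of each box.
    """
    def extend(box_idx, start):
        box = boxes[box_idx]
        end = start + len(box)
        if end > len(seq) or hamming(seq[start:end], box) > max_subs[box_idx]:
            return []
        if box_idx == len(boxes) - 1:
            return [(start,)]
        lo, hi = gaps[box_idx]
        hits = []
        for gap in range(lo, hi + 1):
            for tail in extend(box_idx + 1, end + gap):
                hits.append((start,) + tail)
        return hits

    results = []
    for start in range(len(seq)):
        results.extend(extend(0, start))
    return results


# Hypothetical sequence: a -35 box, a 17-base spacer, then a -10 box.
sequence = "GG" + "TTGACA" + "A" * 17 + "TATAAT" + "CC"
print(find_structured_motif(sequence, ["TTGACA", "TATAAT"], [1, 1], [(16, 18)]))
```

Here p = 2, each box tolerates one substitution, and the single interval allows 16 to 18 bases between the boxes.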

18.
19.
The discoidin I genes of Dictyostelium form a small, co-ordinately regulated multigene family. We have sequenced and compared the upstream regions of the DiscI-alpha, -beta and -gamma genes. For the most part the upstream regions of the three genes are non-homologous. The upstream sequences of the beta and gamma genes are exceedingly A + T-rich, while those of the alpha gene are less so. All three genes have a relatively G + C-rich region 20 to 40 base-pairs in length, found approximately 200 base-pairs 5' to the messenger RNA start site. This G + C-rich region 5' to the beta and gamma genes is flanked by short inverted repeats. Within this region, there is an 11 base-pair exact homology between the alpha and gamma genes, and a less perfect homology between these genes and the beta gene. The homology is flanked at a short distance by interspersed G and T residues. The gamma gene is greater than 90% A + T for greater than 800 base-pairs upstream. Further upstream there is a G + C-rich region that is also found inverted approximately 3.5 × 10^3 base-pairs away. The gamma and beta genes are tandemly linked, and the entire approximately 500 base-pair intergene region between the 3' end of the gamma gene and the 5' end of the beta gene is A + T-rich (approximately 90%) with the exception of the homology region 5' to the gamma gene. We demonstrate also the presence of a discoidin I pseudogene fragment having only 139 base-pairs of discoidin homology with greater than 8% mismatch. It is flanked upstream by five 39 base-pair G + C-rich repeats, and downstream by sequences that are extremely A + T-rich. We discuss the possible significance of the conserved G + C-rich structures on discoidin I gene expression.

20.