Similar articles
 Found 20 similar articles (search time: 46 ms)
1.
We describe a computer tool to aid the discovery of new motifs in nucleic acid sequences. A typical use would be to analyse a set of upstream regions from a family of related genes in order to find possible control sequences. The heart of the method is the creation of dictionaries of related subsequences. These dictionaries can then be analysed to look for the commonest or best-defined subsequences, those that occur in the highest number of different sequences, or for those in equivalent positions within the family. We show the application of the method to a set of E. coli promoter sequences. Received on May 9, 1989; accepted on July 27, 1989
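
The dictionary idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' program: it assumes exact k-mer matching rather than "related" subsequences, and the function names and toy sequences are hypothetical.

```python
from collections import defaultdict

def kmer_dictionary(sequences, k):
    """Map each length-k subsequence to the set of sequence indices containing it."""
    index = defaultdict(set)
    for i, seq in enumerate(sequences):
        seq = seq.upper()
        for start in range(len(seq) - k + 1):
            index[seq[start:start + k]].add(i)
    return index

def most_shared(index, top=5):
    """Rank k-mers by the number of different sequences they occur in."""
    ranked = sorted(index.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    return [(kmer, len(ids)) for kmer, ids in ranked[:top]]

# Toy upstream regions, made up for illustration (not real promoter data).
upstream = ["GGTATAATCC", "ACTATAATGA", "TTTATAATAG"]
print(most_shared(kmer_dictionary(upstream, 6), top=3))
```

Ranking by the number of distinct sequences, rather than by raw occurrence count, mirrors the abstract's criterion of subsequences "that occur in the highest number of different sequences".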

2.
3.
The origin binding protein (OBP) of herpes simplex virus (HSV), which is essential for viral DNA replication, binds specifically to sequences within the viral replication origin(s) (for a review, see Challberg, M.D., and Kelly, T. J. (1989) Annu. Rev. Biochem. 58, 671-717). Using either a COOH-terminal OBP protein A fusion or the full-length protein, each expressed in Escherichia coli, we investigated the interaction of OBP with one HSV origin, OriS. Binding of OBP to a set of binding site variant sequences demonstrates that the 10-base pair sequence, 5' CGTTCGCACT 3', comprises the OBP-binding site. This sequence must be presented in the context of at least 15 total base pairs for high affinity binding, Ka = approximately 0.3 nM. Single base pair mutations in the central CGC sequence lower the affinity by several orders of magnitude, whereas a substitution at any of the other seven positions reduces the affinity by 10-fold or less. OBP binds with high affinity to duplex DNA containing mismatched base pairs. This property is exploited to analyze OBP binding to DNA heteroduplexes containing singly substituted mutant and wild-type DNA strands. For positions 2, 3, 5, 6, 7, 8, and 9, substitutions are tolerated on one or the other DNA strand, indicating that base-mediated interactions are limited to one base of each pair. For both Boxes I and II, these interactions are localized to one face of the DNA helix, forming a recognition surface in the major groove. In OriS, the 31 base pairs which separate Boxes I and II orient the two interaction surfaces to the same side of the DNA.

4.
Roy S, Sahu A, Adhya S. Gene, 2002, 285(1-2): 169-173
GalS, a gene regulatory protein with a helix-turn-helix (HTH) DNA-binding motif, contains a functional operator within the DNA sequences encoding the HTH region (Nature 369 (1994) 314). We searched for operator-like sequences within the DNA sequences encoding the DNA-binding motifs of other regulatory proteins. Five such proteins, DeoR, CytR, LRP, LuxR and PurR, were found to have actual operator or operator-like sequences in the DNA sequences encoding the DNA-binding motif. Except for DeoR, all of them, including GalS, are known to be auto-regulated. Auto-regulation in the case of DeoR has not been investigated. Seven other proteins containing an HTH motif do not have operator-like sequences in the DNA sequences encoding the HTH motif; none of them, except MerR, is known to be auto-regulated. The DNA-binding proteins may have evolved from a common ancestor containing a DNA binding site within the gene segment that encodes the DNA-binding motif, to facilitate auto-regulation. We discuss current evidence for a monophyletic or polyphyletic origin of such sequences.

5.
H Potter, D Dressler. Gene, 1986, 48(2-3): 229-239
A 'Southern Cross' hybridization method is described which permits the rapid restriction mapping of DNA molecules, up to 40 kb in size, for at least ten enzymes in a single operation. The procedure allows the full set of 32P-end-labelled fragments derived from one restriction enzyme digest to intersect and attempt to hybridize to the gel-separated fragments of as many as ten unlabelled digests immobilized on parallel sheets of filter paper. A two-dimensional array of hybridization spots is revealed on each recipient paper, indicating which radioactive and non-radioactive DNA fragments have sequences in common. A restriction map can then be directly and simply deduced from the matrix of hybridization spots in each cross-blot. The method affords advantages over other procedures for obtaining restriction maps in terms of the time required, the number of restriction enzymes that can be mapped, and the potential for eliminating ambiguity. It is also sufficiently sensitive to detect DNA rearrangements and restriction-site polymorphisms in moderately complex genomes. Furthermore, the procedure is applicable to other aspects of the study of genome organization: for example, the exon and intron areas of a segment of cloned genomic DNA can be identified by cross-hybridizing a set of radioactive restriction fragments from the genomic clone against immobilized RNA from a cell type of interest.

6.
Conserved segments in DNA or protein sequences are strong candidates for functional elements and thus appropriate methods for computing them need to be developed and compared. We describe five methods and computer programs for finding highly conserved blocks within previously computed multiple alignments, primarily for DNA sequences. Two of the methods are already in common use; these are based on good column agreement and high information content. Three additional methods find blocks with minimal evolutionary change, blocks that differ in at most k positions per row from a known center sequence and blocks that differ in at most k positions per row from a center sequence that is unknown a priori. The center sequence in the latter two methods is a way to model potential binding sites for known or unknown proteins in DNA sequences. The efficacy of each method was evaluated by analysis of three extensively analyzed regulatory regions in mammalian beta-globin gene clusters and the control region of bacterial arabinose operons. Although all five methods have quite different theoretical underpinnings, they produce rather similar results on these data sets when their parameters are adjusted to best approximate the experimental data. The optimal parameters for the method based on information content varied little for different regulatory regions of the beta-globin gene cluster and hence may be extrapolated to many other regulatory regions. The programs based on maximum allowed mismatches per row have simple parameters whose values can be chosen a priori and thus they may be more useful than the other methods when calibration against known functional sites is not available.
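
As a rough illustration of the information-content approach mentioned in the abstract above, the sketch below scores each column of an ungapped DNA alignment as 2 minus its Shannon entropy (in bits) and reports runs of high-scoring columns. The thresholds `min_bits` and `min_len`, the function names and the toy alignment are assumptions for illustration; the published programs may define blocks and parameters differently, and no small-sample correction is applied here.

```python
import math

def column_information(column):
    """Information content in bits of one ungapped DNA column: 2 - entropy."""
    counts = {}
    for base in column:
        counts[base] = counts.get(base, 0) + 1
    total = len(column)
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return 2.0 - entropy

def conserved_blocks(alignment, min_bits=1.5, min_len=3):
    """Return half-open (start, end) column ranges whose columns all reach min_bits."""
    n_cols = len(alignment[0])
    blocks, start = [], None
    for j in range(n_cols + 1):
        # The extra iteration at j == n_cols closes a block that runs to the end.
        ok = j < n_cols and column_information([row[j] for row in alignment]) >= min_bits
        if ok and start is None:
            start = j
        elif not ok and start is not None:
            if j - start >= min_len:
                blocks.append((start, j))
            start = None
    return blocks

# Toy alignment, made up for illustration.
alignment = ["ACCAATGG", "TCCAATCA", "GCCAATTT", "CCCAATAC"]
print(conserved_blocks(alignment))
```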

7.
The search for genes in a newly sequenced DNA is a well known problem. Among other factors, the gene-searching process is hampered by a number of ambiguities which may remain unresolved experimentally for a long time. A computer method that is able to predict genes in a DNA sequence containing ambiguities has been developed, based on the non-homogeneous Markov chain technique. The reliability of the method has been tested using a set of sequences generated by a Monte-Carlo procedure and a set of 425 E. coli sequences with ambiguities introduced artificially.

8.
Bony fishes (Osteichthyes) comprise over 22,000 species, about half of all vertebrate species. In order to investigate the phylogenetic relationships within this vertebrate class, we have studied the only protein whose primary structure is known in a rather large (27) number of fish species belonging to seven orders: the growth hormone. The phylogeny obtained using the maximum parsimony method based on amino acid sequences represents the first molecular phylogeny of teleostean fishes based on an extensive set of data. This phylogeny agrees remarkably well with the generally accepted phylogeny based on morphological characters and paleontological data. Correspondence to: G. Bernardi

9.
10.
The Escherichia coli Fis protein binds to specific DNA sequences whose base composition varies enormously. One known function of Fis is to stimulate site-specific DNA recombination. We used the Gin-mediated DNA inversion system of bacteriophage Mu to analyze Fis-DNA interaction. Efficient inversion requires an enhancer which consists of two Fis binding sites at a fixed distance from each other. Using mutant enhancers in which one of the Fis binding sites is replaced we show that Fis binds symmetrically to the DNA and we locate the center of symmetry. Furthermore, we show that one of the Fis binding sites can be replaced by a Fis binding site that normally functions in a process other than site-specific recombination.

11.
Kasai S, Yamazaki T. Gene, 2001, 264(2): 281-288
To confirm the presence of cobalamin-dependent methionine synthase (CDMS) in luminous bacteria, which is a prerequisite for the substantiation of our proposals on the physiological function of the lux operon, we identified the CDMS gene (metH) in Vibrio fischeri ATCC 7744. Two partial metH sequences, one located near the 5'-terminus of the gene and the other near the 3'-terminus, were sequenced by a PCR-based method. To design a new set of PCR primers located on the two flanking regions of the gene, the genomic DNA was sequenced by the SUGDAT method (sequencing using genomic DNA as a template) upstream or downstream from the respective partial gene sequences. Subsequently, a 4.2 kb DNA fragment containing the whole metH was amplified by PCR and sequenced. The number of amino acid residues comprising the protein (1226 amino acids) was comparable to those of known CDMSs. The deduced amino acid sequence showed 85, 74, 55, 31, 30, 52, or 52% identity with that of Vibrio cholerae, Escherichia coli, Deinococcus radiodurans, Synechocystis PCC6803, Mycobacterium tuberculosis, Caenorhabditis elegans or Homo sapiens, respectively. All the predicted amino acid residues for the binding of cobalamin and S-adenosylmethionine were conserved. In the regulatory region of the V. fischeri metH, the binding site of the met repressor, MetJ, was present, although the site is atypically not present in E. coli metH or Salmonella typhimurium metH. It was shown that nucleotide sequences, even long ones, can be determined without a cloning step, if only parts of the DNA fragment to be sequenced are amplified by PCR.

12.
Relationships between gene trees and species trees   Total citations: 49, self-citations: 10, citations by others: 39
It is well known that a phylogenetic tree (gene tree) constructed from DNA sequences for a genetic locus does not necessarily agree with the tree that represents the actual evolutionary pathway of the species involved (species tree). One of the important factors that cause this difference is genetic polymorphism in the ancestral species. Under the assumption of neutral mutations, this problem can be studied by evaluating the probability (P) that a gene tree has the same topology as that of the species tree. When one gene (allele) is used from each of the species involved, the probability can be expressed as a simple function of Ti = ti/(2N), where ti is the evolutionary time measured in generations for the ith internodal branch of the species tree and N is the effective population size. When any of the Ti's is less than 1, the probability P becomes considerably less than 1.0. This probability cannot be substantially increased by increasing the number of alleles sampled from a locus. To increase the probability, one has to use DNA sequences from many different loci that have evolved independently of each other.
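
For the simplest case of three species with one allele sampled from each, there is a single internodal branch, and the standard neutral-coalescent result is P = 1 - (2/3) exp(-T) with T = t/(2N). The abstract does not quote this formula, so the sketch below is an illustration under that assumption, with hypothetical values of t and N.

```python
import math

def prob_gene_tree_matches(T):
    """Three species, one allele each: P = 1 - (2/3) * exp(-T), where T = t / (2N)."""
    return 1.0 - (2.0 / 3.0) * math.exp(-T)

# Hypothetical internodal times t (generations) and effective population size N.
for t, N in [(1_000, 10_000), (20_000, 10_000), (100_000, 10_000)]:
    T = t / (2 * N)
    print(f"t={t}, N={N}, T={T:.2f}, P={prob_gene_tree_matches(T):.3f}")
```

At T = 0 the three possible topologies are equally likely, so P starts at 1/3 and approaches 1 only when T is several units long, in line with the abstract's remark about Ti below 1.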

13.
Detection of functional DNA motifs via statistical over-representation   Total citations: 14, self-citations: 0, citations by others: 14

14.
TREES SIFTER 1.0 implements an approximate method to estimate the time to the most recent common ancestor (TMRCA) of a set of DNA sequences, using population evolution modelling. In essence, the program simulates genealogies with a user‐defined model of coalescence of lineages, and then compares each simulated genealogy to the genealogy inferred from the real data, through two summary statistics: (i) the number of mutations on the genealogy (Mn), and (ii) the number of different sequence types (alleles) observed (Kn). The simulated genealogies are then submitted to a rejection algorithm that keeps only those that are the most likely to have generated the observed sequence data. At the end of the process, the accepted genealogies can be used to estimate the posterior probability distribution of the TMRCA.
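
The simulate-and-reject idea described above can be sketched in a few lines. This is not the program's implementation: it assumes a constant-size (Kingman) coalescent instead of a user-defined model, matches only on the number of mutations (Mn) and ignores Kn, accepts exact matches only, and uses hypothetical parameter values. Time is measured in units of 2N generations and `theta` stands for 4Nμ.

```python
import random

def poisson(mean, rng):
    """Poisson draw: count unit-rate exponential arrivals that fall before `mean`."""
    count, elapsed = 0, rng.expovariate(1.0)
    while elapsed < mean:
        count += 1
        elapsed += rng.expovariate(1.0)
    return count

def simulate_genealogy(n, theta, rng):
    """Return (TMRCA, number of mutations) for n lineages under the Kingman coalescent."""
    tmrca, total_length = 0.0, 0.0
    for k in range(n, 1, -1):
        wait = rng.expovariate(k * (k - 1) / 2)  # time while k lineages remain
        tmrca += wait
        total_length += k * wait                 # k branches extend over this interval
    return tmrca, poisson(theta / 2 * total_length, rng)

def rejection_tmrca(n, theta, observed_mutations, draws, seed=0):
    """Keep the TMRCA of every simulated genealogy whose mutation count matches the data."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(draws):
        tmrca, mutations = simulate_genealogy(n, theta, rng)
        if mutations == observed_mutations:
            accepted.append(tmrca)
    return accepted

# Hypothetical data: 10 sequences, 12 mutations on the inferred genealogy.
samples = rejection_tmrca(n=10, theta=5.0, observed_mutations=12, draws=20000)
if samples:
    print(len(samples), "accepted; mean TMRCA =", sum(samples) / len(samples))
```

The accepted values approximate a posterior sample of the TMRCA under these assumptions; a histogram of `samples` would play the role of the posterior distribution mentioned in the abstract.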

15.
The DNA-binding domain of Myb consists of three imperfect tandem repeats, and the third one, which is essential for sequence-specific binding, was established to have a helix-turn-helix-related motif. DNA sequences recognized by Myb have been reported to contain the TAACPy sequence. Here we have examined the details of the Myb-binding sequence. Using DNAs with a single mutation at various sites of two specific DNAs and some fragments of the DNA-binding domain of Myb, we have found that (i) in a specific DNA which contains only one AAC sequence, each nucleotide of AAC is essential for the specific binding of Myb, while other mutations cause no serious loss of binding, (ii) in a specific DNA which contains two separate AAC sequences, one AAC is not so important for binding, and (iii) for specific binding to DNA, at least repeats 2 and 3 of Myb are both required. These findings suggest that repeat 3, containing a helix-turn-helix-related structure, recognizes the core AAC sequence and repeat 2 supports this recognition by interactions with phosphate groups of DNA.

16.
Sampling strategies for distances between DNA sequences   Total citations: 2, self-citations: 0, citations by others: 2
B S Weir, C J Basten. Biometrics, 1990, 46(3): 551-582
An international effort is now underway to obtain the DNA sequence for the entire human genome (Watson and Jordan, 1989, Genomics 5, 654-656; Barnhart, 1989, Genomics 5, 657-660). This Human Genome Initiative will generate sequence data from several species other than humans, and will result in several copies per species of at least some regions of the genome. Although the project has generated much interest, it is but one aspect of the widespread effort to generate DNA sequence data. Published sequences are collected in common databases, and release 63 of GenBank in March 1990 contained 40,127,752 bases from 33,337 reported sequences (News from GenBank 3; Mountain View, California: Intelligenetics, Inc., 1990). Large though this database is, it is only about 1% of the number of bases in the human genome. Interpretations of data of such magnitude are going to require the collaborative efforts of biometricians and molecular biologists, and an aim of this paper is to show that there is also a role for readers of this journal in the design of surveys of DNA sequences. Discussion here will center on the use of sequence data in evolutionary studies, where some region of DNA is sequenced in several different species. The object is to infer the evolutionary history of that particular region, or of the species themselves. Statistical issues in the very important studies on sequences to locate and characterize regions responsible for human diseases will not be addressed here. We will discuss appropriate ways of measuring distances between DNA sequences and of predicting the sampling properties of the distances. There are procedures for inferring evolutionary histories for a set of elements that depend on a matrix of distances between each pair of elements, and the precision of resulting trees must be influenced by the precision of the distances. We will show that account needs to be taken of two sampling processes--the sampling of sequences by the investigator ("statistical sampling"), and the sampling of genetic material involved in the formation of offspring from a parental population ("genetic sampling").
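
To make "a matrix of distances between each pair of elements" concrete, here is a minimal sketch that builds such a matrix from aligned sequences. The abstract does not say which distance the authors use; the Jukes-Cantor correction, d = -(3/4) ln(1 - 4p/3), is swapped in here as a familiar example, and the function names and toy sequences are hypothetical. The sketch does not model either of the two sampling processes discussed in the paper.

```python
import math

def p_distance(a, b):
    """Proportion of aligned sites at which two equal-length sequences differ."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)

def jukes_cantor(a, b):
    """Jukes-Cantor distance: d = -(3/4) * ln(1 - 4p/3)."""
    p = p_distance(a, b)
    if p >= 0.75:
        raise ValueError("sequences too divergent for the Jukes-Cantor correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

def distance_matrix(sequences):
    """Symmetric matrix of pairwise Jukes-Cantor distances."""
    n = len(sequences)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            matrix[i][j] = matrix[j][i] = jukes_cantor(sequences[i], sequences[j])
    return matrix

# Toy aligned sequences, made up for illustration.
toy = ["ACGTACGT", "ACGTACGA", "ACGAACGA"]
for row in distance_matrix(toy):
    print(["%.4f" % d for d in row])
```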

17.
Seven cloned small circular DNA molecules from CHO cells were sequenced and examined for the presence of homologies to each other and to a number of other functional sequences present in transposable elements, retroviruses, mammalian repeat sequences, and introns. The sequences of the CHO cell circular DNA molecules did not reveal common structural features that could explain their presence in the circular DNA population. A gene bank was constructed for CHO chromosomal DNA and sequences homologous to two of the seven small circular DNA molecules were isolated and sequenced. The nucleotide sequences present at the junction of circular and chromosomal DNA suggest that a recombination process involving homologous pairing may have been involved in the generation of one, but not the other, of the two circular DNA molecules.

18.
The evolutionarily conserved CCAAT binding protein NF-Y is a common regulatory DNA binding protein consisting of three distinct subunits. Unlike yeast and mammals, in which only a single copy of each subunit is encoded, Arabidopsis encodes a multi-gene family for each subunit in its genome. Compared with the NF-Y of mammals or yeast, very little is known about plant NF-Y homologs. Here, Arabidopsis NF-YA subunits were isolated to determine whether they could form a heterotrimeric NF-Y complex with mammalian NF-YB and NF-YC. The resultant chimeric NF-Y complex was able to bind the same CCAAT sequences as NF-Y from other organisms. Therefore, it is possible that plant NF-Y homologs have biochemical characteristics similar to those of mammalian NF-Y, suggesting functional conservation among organisms.

19.
20.
Eukaryotic chromosomal DNA replication is initiated by a highly conserved set of proteins that interact with cis-acting elements on chromosomes called replicators. Despite the conservation of replication initiation proteins, replicator sequences show little similarity from species to species in the small number of organisms that have been examined. Examination of replicators in other species is likely to reveal common features of replicators. We have examined a Kluyveromyces lactis replicator, KARS12, that functions as an origin of DNA replication on plasmids and in the chromosome. It contains a 50-bp region with similarity to two other K. lactis replicators, KARS101 and the pKD1 replication origin. Replacement of the 50-bp sequence with an EcoRI site completely abrogated the ability of KARS12 to support plasmid and chromosomal DNA replication origin activity, demonstrating that this sequence is a common feature of K. lactis replicators and is essential for function, possibly as the initiator protein binding site. Additional sequences up to 1 kb in length are required for efficient KARS12 function. Within these sequences are a binding site for a global regulator, Abf1p, and a region of bent DNA, both of which contribute to the activity of KARS12. These elements may facilitate protein binding, protein/protein interaction and/or nucleosome positioning as has been proposed for other eukaryotic origins of DNA replication.
