Similar literature
 Found 20 similar documents; search took 15 ms
1.
The shufflon of plasmid R64 consists of four DNA segments separated and flanked by seven sfx recombination sites. Rci-mediated recombination between any inverted sfx sequences causes inversion of the DNA segments independently or in groups. The R64 shufflon selects one of seven pilV genes encoding type IV pilus adhesins, in which the N-terminal region is constant, while the C-terminal regions are variable. The R64 sfx sequences are asymmetric. The sfx central region and right arm sequences are conserved, but left arm sequences are not. Here we constructed a symmetric sfx sequence, in which the sfx left arm sequence was changed to the inverted repeat of the right arm sequence, and made artificial shufflon segments carrying symmetric sfx sequences in inverted or direct orientations. The symmetric sfx sequence exhibited the highest inversion frequency in a shufflon segment flanked by two inverted sfx sequences. Rci-dependent deletion of a shufflon segment flanked by two direct symmetric sfx sequences was observed, suggesting that asymmetry of R64 sfx sequences inhibits recombination between direct sfx sequences. In addition, intermolecular recombination between symmetric sfx sequences was also observed. The extra C-terminal domain of Rci was shown to be essential for inversion of the R64 shufflon using asymmetric sfx sequences but not essential for recombination using symmetric sfx sequences, suggesting that the Rci C-terminal segment helps the binding of Rci to asymmetric sfx sequences. Rci protein lacking the C-terminal domain bound to both arms of the symmetric sfx sequence but only to the right arm of the asymmetric sfx sequence.

2.
Repetitive DNA sequences near immunoglobulin genes in the mouse genome (Steinmetz et al., 1980a,b) were characterized by restriction mapping and hybridization. Six sequences were determined that turned out to belong to a new family of dispersed repetitive DNA. From the sequences, which are called R1 to R6, a 475 base-pair consensus sequence was derived. The R family is clearly distinct from the mouse B1 family (Krayev et al., 1980). According to saturation hybridization experiments, there are about 100,000 R sequences per haploid genome, and they are probably distributed throughout the genome. The individual R sequences have an average divergence from the consensus sequence of 12.5%, which is largely due to point mutations and, among those, to transitions. Some R sequences are severely truncated. The R sequences extend into A-rich sequences and are flanked by short direct repeats. Also, two large insertions in the R2 sequence are flanked by direct repeats. In the neighbourhood of and within R sequences, stretches of DNA have been identified that are homologous to parts of small nuclear RNA sequences. Mouse satellite DNA-like sequences and members of the B1 family were also found in close proximity to the R sequences. The dispersion of R sequences within the mouse genome may be a consequence of transposition events. The possible role of the R sequences in recombination and/or gene conversion processes is discussed.

3.
Repetitive sequences constitute a significant component of most eukaryotic genomes, and the isolation and characterization of repetitive DNA sequences provide an insight into the organization and evolution of the genome of interest. We report the isolation and characterization of the major classes of repetitive sequences from the genome of Panax ginseng. The isolation of repetitive DNA from P. ginseng was achieved by the reannealing of chemically hydrolyzed (200 bp-1 kb fragments) and heat-denatured genomic DNA to a low C(o)t value. The low C(o)t fraction was cloned, and fifty-five P. ginseng clones were identified that contained repetitive sequences. Sequence analysis revealed that the fraction includes repetitive telomeric sequences, species-specific satellite sequences, chloroplast DNA fragments and sequences that are homologous to retrotransposons. Two of the retrotransposon-like sequences are homologous to Ty1/copia-type retroelements of Zea mays, and six cloned sequences are homologous to various regions of the del retrotransposon of Lilium henryi. The del retrotransposon-like sequences and several novel repetitive DNA sequences from P. ginseng were used to differentiate P. ginseng from P. quinquefolius, and should be useful for evolutionary studies of these disjunct species.

4.
The frequencies of "words", oligonucleotides within nucleotide sequences, reflect the genetic information contained in the sequence "texts". Nucleotide sequences are characteristically represented by their contrast word vocabularies. Comparison of the sequences by correlating their contrast vocabularies is shown to reflect well the relatedness (unrelatedness) between the sequences. A single value, the linguistic similarity between the sequences, is suggested as a measure of sequence relatedness. Sequences as short as 1000 bases can be characterized and quantitatively related to other sequences by this technique. The linguistic sequence similarity value is used for analysis of taxonomically and functionally diverse nucleotide sequences. The similarity value is shown to be very sensitive to the relatedness of the source species, thus providing a convenient tool for taxonomic classification of species by their sequence vocabularies. Functionally diverse sequences appear distinct by their linguistic similarity values. This can be a basis for a quick screening technique for functional characterization of the sequences and for mapping functionally distinct regions in long sequences.
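
The idea of comparing sequences through their word vocabularies can be illustrated with a minimal Python sketch. It is not the authors' method: the abstract refers to "contrast" vocabularies, whose construction is not given here, so the sketch assumes plain overlapping k-mer frequencies and a Pearson correlation as a stand-in. The function names (`kmer_profile`, `pearson`, `word_similarity`) and the default word length `k=3` are hypothetical.

```python
from collections import Counter
from itertools import product
from math import sqrt


def kmer_profile(seq, k=3):
    """Relative frequencies of all 4**k DNA words, counted in overlapping windows."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    words = ["".join(p) for p in product("ACGT", repeat=k)]
    # Windows containing other symbols (e.g. N) are ignored in the total.
    total = sum(counts[w] for w in words) or 1
    return [counts[w] / total for w in words]


def pearson(x, y):
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def word_similarity(seq_a, seq_b, k=3):
    """Assumed stand-in for a 'linguistic similarity': correlation of k-mer profiles."""
    return pearson(kmer_profile(seq_a, k), kmer_profile(seq_b, k))
```

A sequence compared with itself scores 1.0 under this sketch, and sequences with very different word usage score lower; real use would need sequences long enough for the word counts to be meaningful.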

5.
Goto N, Kurokawa K, Yasunaga T. Gene, 2007, 401(1-2): 172-180
To date, the complete genome sequences of more than 250 organisms have been determined. This information can now be used to determine whether there exist any invariant sequences that are conserved among all organisms, from bacteria to plants, animals, and humans. The existence of invariant sequences would strongly suggest that these sequences have been inherited unchanged from the last common ancestor of all life, and that they have essential functions. We have developed a new software program to identify invariant sequences conserved among the currently sequenced genomes and applied this analysis to the complete genome sequences of 266 organisms. We have identified 3 invariant DNA sequences longer than or equal to 11 bp and 6 invariant amino acid sequences longer than or equal to 6 aa. The longest invariant DNA sequence, AAGTCGTAACAAGGT (15 bp), was found in the 16S/18S rRNA gene. Two 8 aa sequences, GHVDHGKT in IF2 and EF-Tu and DTPGHVDF in EF-G, were the longest invariant amino acid sequences detected. These sequences could be essential elements from the genome of the last common ancestor and may have remained unchanged throughout evolution.
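
The search for invariant sequences can be pictured as a set intersection over fixed-length words. The Python sketch below is only an illustration under stated assumptions: it is not the authors' program, it looks at one strand only, and it holds every word set in memory, which would not scale to complete genomes. The names `kmers` and `invariant_kmers` are hypothetical.

```python
def kmers(seq, k):
    """Set of all length-k words in seq (overlapping windows, one strand only)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def invariant_kmers(sequences, k):
    """Words of length k that occur in every sequence of the collection."""
    if not sequences:
        return set()
    shared = kmers(sequences[0], k)
    for seq in sequences[1:]:
        shared &= kmers(seq, k)
        if not shared:
            # No candidate survives, so the remaining sequences need not be read.
            break
    return shared
```

With three toy strings that share the word AAGTCG, for example `invariant_kmers(["TTAAGTCGAA", "CCAAGTCGGG", "AAGTCGTT"], 6)`, only that word survives the intersection.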

6.
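Analysis of minisatellite repeat sequences in the genome of the Chinese shrimp Fenneropenaeus chinensis   Cited by: 4 (self-citations: 0; by others: 4)
Gao Huan, Kong Jie. Acta Zoologica Sinica, 2005, 51(1): 101-107
By sequencing random genomic DNA fragments of the Chinese shrimp Fenneropenaeus chinensis, we obtained genomic DNA sequence totalling about 641,000 bases, in which 1,720 repeat sequences were found. Of these, 398 were minisatellites, accounting for 23.14% of all repeat sequences. The repeat units of these minisatellites were 7-165 bases long and were concentrated in the 7-21 base range; repeats with a 12-base unit were the most numerous (58), accounting for 14.57% of all minisatellite repeats. With respect to copy number, repeats made up of 2 copies of the repeat unit were the most numerous (137), followed by repeats with 3 copies (122), and the number of repeats declined as copy number increased. Some of the sequences are available in GenBank under accession numbers AY699072-AY699076. The 398 repeat sequences were composed of 398 different repeat units, so the minisatellite repeats are of many types. We provisionally divided them into three classes, composed of two, three, or four kinds of bases, and further divided each class into subclasses by ordering the bases in each repeat from most to least abundant. These classifications show that the minisatellites in the Chinese shrimp genome are, on the whole, A+T-rich repeats with a certain "hierarchy", which reveals their relationship to microsatellite repeats: some minisatellite repeats may have originated from microsatellites.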
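
A minimal Python sketch of how perfect tandem repeats with minisatellite-sized units could be located is given below. It is an assumption-laden illustration, not the software used in the study: it reports only exact (mismatch-free) repeats, uses a brute-force scan, and takes its default unit range (7-21 bases) and minimum copy number (2) from the figures quoted in the abstract. The function name `find_tandem_repeats` is hypothetical.

```python
def find_tandem_repeats(seq, min_unit=7, max_unit=21, min_copies=2):
    """Return (start, unit, copies) for exact tandem repeats, scanning left to right."""
    seq = seq.upper()
    n = len(seq)
    hits = []
    i = 0
    while i < n:
        best = None  # (start, unit_len, copies) covering the longest stretch at i
        for unit_len in range(min_unit, max_unit + 1):
            unit = seq[i:i + unit_len]
            if len(unit) < unit_len:
                break  # not enough sequence left for this or any longer unit
            copies = 1
            # Count how many further exact copies of the unit follow directly.
            while seq[i + copies * unit_len:i + (copies + 1) * unit_len] == unit:
                copies += 1
            if copies >= min_copies:
                if best is None or copies * unit_len > best[1] * best[2]:
                    best = (i, unit_len, copies)
        if best:
            start, unit_len, copies = best
            hits.append((start, seq[start:start + unit_len], copies))
            i = start + unit_len * copies  # continue after the reported repeat
        else:
            i += 1
    return hits
```

Applied to a toy string built as `"TTT" + "ACGTACG" * 3 + "CC"`, the sketch reports a single 7-base unit with three copies starting at position 3. Real minisatellites usually contain mismatches between copies, which this exact-match scan would miss.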

7.
8.
The arrangements of inverted-repeated and repeated DNA sequences in the human genome have been investigated by an electron microscope method. The arrangement of the interspersed repeated DNA sequences is found to be similar to the corresponding arrangement found in Xenopus. This arrangement consists of 300-nucleotide-long repeated DNA sequences interspersed with roughly gene-size single-copy DNA sequences. The inverted-repeated sequences are also 300 nucleotides in length and are interspersed with the other DNA sequence classes. Most inverted-repeated sequences (64%) are spaced by another sequence which is recognized by electron microscopy as a single-stranded loop in a hairpin structure. The average length of this spacer loop is 1.6 kilobases. Although some pairs of inverted-repeated sequences are clustered, most seem to be randomly distributed throughout the genome. The average distance separating two pairs of inverted-repeated sequences is 10 to 20 kilobases. The interspersed repeated sequences and inverted-repeated sequences are arranged simultaneously in a portion of the human genome resulting in an interspersion of all three sequence classes.

9.
The phylogenetic diversity of an oligotrophic marine picoplankton community was examined by analyzing the sequences of cloned ribosomal genes. This strategy does not rely on cultivation of the resident microorganisms. Bulk genomic DNA was isolated from picoplankton collected in the north central Pacific Ocean by tangential flow filtration. The mixed-population DNA was fragmented, size fractionated, and cloned into bacteriophage lambda. Thirty-eight clones containing 16S rRNA genes were identified in a screen of 3.2 × 10^4 recombinant phage, and portions of the rRNA gene were amplified by polymerase chain reaction and sequenced. The resulting sequences were used to establish the identities of the picoplankton by comparison with an established data base of rRNA sequences. Fifteen unique eubacterial sequences were obtained, including four from cyanobacteria and eleven from proteobacteria. A single eucaryote related to dinoflagellates was identified; no archaebacterial sequences were detected. The cyanobacterial sequences are all closely related to sequences from cultivated marine Synechococcus strains and with cyanobacterial sequences obtained from the Atlantic Ocean (Sargasso Sea). Several sequences were related to common marine isolates of the gamma subdivision of proteobacteria. In addition to sequences closely related to those of described bacteria, sequences were obtained from two phylogenetic groups of organisms that are not closely related to any known rRNA sequences from cultivated organisms. Both of these novel phylogenetic clusters are proteobacteria, one group within the alpha subdivision and the other distinct from known proteobacterial subdivisions. The rRNA sequences of the alpha-related group are nearly identical to those of some Sargasso Sea picoplankton, suggesting a global distribution of these organisms.

10.
The frequencies of “words”, oligonucleotides within nucleotide sequences, reflect the genetic information contained in the sequence “texts”. Nucleotide sequences are characteristically represented by their contrast word vocabularies. Comparison of the sequences by correlating their contrast vocabularies is shown to reflect well the relatedness (unrelatedness) between the sequences. A single value, the linguistic similarity between the sequences, is suggested as a measure of sequence relatedness. Sequences as short as 1000 bases can be characterized and quantitatively related to other sequences by this technique. The linguistic sequence similarity value is used for analysis of taxonomically and functionally diverse nucleotide sequences. The similarity value is shown to be very sensitive to the relatedness of the source species, thus providing a convenient tool for taxonomic classification of species by their sequence vocabularies. Functionally diverse sequences appear distinct by their linguistic similarity values. This can be a basis for a quick screening technique for functional characterization of the sequences and for mapping functionally distinct regions in long sequences.

11.
Comparison of 170 sequences of the kinetoplast DNA minicircle hypervariable region, obtained from 19 stocks of Trypanosoma cruzi and 2 stocks of Trypanosoma cruzi marinkellei, showed that only 56% exhibited significant homology with other sequences. These sequences could be grouped into homology classes showing no significant sequence similarity with any other homology group. The remaining 44% of sequences thus corresponded to unique sequences in our data set. In DTU I (DTU: "Discrete Typing Unit"), 51% of the sequences were unique. In contrast, in DTU IId, 87.5% of sequences were distributed into three classes. The results obtained for T. cruzi marinkellei showed that all sequences were unique, without any similarity between them and T. cruzi sequences. Analysis of palindromes in all sequence sets showed a high frequency of the EcoRI site. Analysis of repetitive sequences suggested a common ancestral origin of the kDNA. The editing mechanism that occurs in kinetoplastidae is discussed.

12.
The RNA genome of the Moloney isolate of murine sarcoma virus (M-MSV) consists of two parts: a sarcoma-specific region with no homology to known leukemia viral RNAs, and a shared region present also in Moloney murine leukemia virus RNA. Complementary DNA was isolated which was specific for each part of the M-MSV genome. The DNA of a number of mammalian species was examined for the presence of nucleotide sequences homologous with the two M-MSV regions. Both sets of viral sequences had homologous nucleotide sequences present in normal mouse cellular DNA. MSV-specific sequences found in mouse cellular DNA closely matched those nucleotide sequences found in M-MSV as seen by comparisons of thermal denaturation profiles. In all normal mouse cells tested, the cellular set of M-MSV-specific nucleotide sequences was present in DNA as one to a few copies per cell. The rate of base substitution of M-MSV nucleotide sequences was compared with the rate of evolution of both unique sequences and the hemoglobin gene of various species. Conservation of MSV-specific nucleotide sequences among species was similar to that of mouse globin gene(s) and greater than that of average unique cellular sequences. In contrast, cellular nucleotide sequences that are homologous to the M-MSV-murine leukemia virus "common" nucleotide region were present in multiple copies in mouse cells and were less well matched, as seen by reduced melting profiles of the hybrids. The cellular common nucleotide sequences diverged very rapidly during evolution, with a base substitution rate similar to that reported for some primate and avian endogenous virogenes. The observation that two sets of covalently linked viral sequences evolved at very different rates suggests that the origin of M-MSV may be different from endogenous helper viruses and that cellular sequences homologous to MSV-specific nucleotide sequences may be important to survival.

13.
We describe a new computer program that identifies conserved secondary structures in aligned nucleotide sequences of related single-stranded RNAs. The program employs a series of hash tables to identify and sort common base paired helices that are located in identical positions in more than one sequence. The program gives information on the total number of base paired helices that are conserved between related sequences and provides detailed information about common helices that have a minimum of one or more compensating base changes. The program is useful in the analysis of large biological sequences. We have used it to examine the number and type of complementary segments (potential base paired helices) that can be found in common among related random sequences similar in base composition to 16S rRNA from Escherichia coli. Two types of random sequences were analyzed. One set consisted of sequences that were independent but they had the same mononucleotide composition as the 16S rRNA. The second set contained sequences that were 80% similar to one another. Different results were obtained in the analysis of these two types of random sequences. When 5 sequences that were 80% similar to one another were analyzed, significant numbers of potential helices with two or more independent base changes were observed. When 5 independent sequences were analyzed, no potential helices were found in common. The results of the analyses with random sequences were compared with the number and type of helices found in the phylogenetic model of the secondary structure of 16S ribosomal RNA. Many more helices are conserved among the ribosomal sequences than are found in common among similar random sequences. In addition, conserved helices in the 16S rRNAs are, on the average, longer than the complementary segments that are found in comparable random sequences. The significance of these results and their application in the analysis of long non-ribosomal nucleotide sequences is discussed. 

14.
MOTIVATION: The completion of the Arabidopsis genome offers the first opportunity to analyze all of the membrane protein sequences of a plant. The majority of integral membrane proteins including transporters, channels, and pumps contain hydrophobic alpha-helices and can be selected based on TransMembrane Spanning (TMS) domain prediction. By clustering the predicted membrane proteins based on sequence, it is possible to sort the membrane proteins into families of known function, based on experimental evidence or homology, or unknown function. This provides a way to identify target sequences for future functional analysis. RESULTS: An automated approach was used to select potential membrane protein sequences from the set of all predicted proteins and cluster the sequences into related families. The recently completed sequence of Arabidopsis thaliana, a model plant, was analyzed. Of the 25,470 predicted protein sequences 4589 (18%) were identified as containing two or more membrane spanning domains. The membrane protein sequences clustered into 628 distinct families containing 3208 sequences. Of these, 211 families (1764 sequences) either contained proteins of known function or showed homology to proteins of known function in other species. However, 417 families (1444 sequences) contained only sequences with no known function and no homology to proteins of known function. In addition, 1381 sequences did not cluster with any family and no function could be assigned to 1337 of these.

15.
Comparing DNA or protein sequences plays an important role in the functional analysis of genomes. Despite the many methods available for sequence comparison, few retain the information content of sequences. We propose a new approach, the Yau-Hausdorff method, which considers all translations and rotations when seeking the best match of graphical curves of DNA or protein sequences. The complexity of this method is lower than that of any other two-dimensional minimum Hausdorff algorithm. The Yau-Hausdorff method can be used for measuring the similarity of DNA sequences based on two important tools: the Yau-Hausdorff distance and graphical representation of DNA sequences. The graphical representations of DNA sequences conserve all sequence information, and the Yau-Hausdorff distance is mathematically proved to be a true metric. Therefore, the proposed distance can precisely measure the similarity of DNA sequences. The phylogenetic analyses of DNA sequences by the Yau-Hausdorff distance show the accuracy and stability of our approach in similarity comparison of DNA or protein sequences. This study demonstrates that the Yau-Hausdorff distance is a natural metric for DNA and protein sequences with a high level of stability. The approach can also be applied to similarity analysis of protein sequences by graphic representations, as well as to general two-dimensional shape matching.
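
The two ingredients named in the abstract, a graphical curve for each sequence and a Hausdorff-type distance between curves, can be sketched in Python as follows. This is a simplified illustration, not the Yau-Hausdorff method: it computes the ordinary Hausdorff distance between two fixed point sets and does not minimize over translations and rotations. The step vectors in `STEPS` are hypothetical and may differ from the paper's graphical representation, as are the function names.

```python
from math import hypot

# Hypothetical step vectors; the paper's own graphical representation may differ.
STEPS = {"A": (1.0, 2.0), "C": (1.0, 1.0), "G": (1.0, -1.0), "T": (1.0, -2.0)}


def dna_curve(seq):
    """Map a DNA string to a 2D point list by accumulating one step per base."""
    x = y = 0.0
    points = [(x, y)]
    for base in seq.upper():
        dx, dy = STEPS[base]
        x, y = x + dx, y + dy
        points.append((x, y))
    return points


def directed_hausdorff(p, q):
    """Largest distance from a point of p to its nearest point of q."""
    return max(min(hypot(ax - bx, ay - by) for bx, by in q) for ax, ay in p)


def hausdorff(p, q):
    """Ordinary (symmetric) Hausdorff distance between two finite point sets."""
    return max(directed_hausdorff(p, q), directed_hausdorff(q, p))
```

Identical sequences give a distance of zero, and the value grows as the curves diverge. The quadratic-time comparison here is chosen for clarity only; the abstract's claim about lower complexity concerns the authors' own algorithm.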

16.
The nucleic acid sequences found in DNA and RNA from rat cells which are homologous to Kirsten sarcoma virus have been characterized. The homologous sequences are present in multiple copies per diploid rat cellular genome in a variety of different rat cellular DNAs. In certain cells that constitutively express only low levels of sequences homologous to Kirsten sarcoma virus, bromodeoxyuridine treatment leads to the expression of high levels of these sequences in RNA. Supernatants from cell lines producing the sequences homologous to Kirsten sarcoma virus contain high levels of these sequences which are purified to the same degree as the previously known rat type C viral nucleic acid sequences by type C particles being released from such cells. The results indicate that the sequences in rat cells homologous to Kirsten sarcoma virus have three characteristics of known mammalian type C viruses, and suggest that at least part of Kirsten sarcoma virus rat-derived sequences represent a distinct class of endogenous rat type C virus that has no detectable homology to the other known class of endogenous rat type C virus.

17.
18.
The uRNA database.   Cited by: 3 (self-citations: 0; by others: 3)
The uRNADB offers aligned, annotated and phylogenetically ordered sequences of several U RNAs. New to this release are RNAs from U7 (14 sequences), U8 (two sequences), U11 (three sequences), U12 (two sequences), U14 (11 sequences), U18, U48 and U49. A total of 34 new sequences were aligned with the previously compiled snRNAs U1, U2, U3, U4, U5 and U6.

19.
The diversity of serine proteases secreted from Chrysomya bezziana larvae was investigated biochemically and by PCR and sequence analysis. Cation-exchange chromatography of purified larval serine proteases resolved four trypsin-like activities and three chymotrypsin-like activities as discerned by kinetic studies with benzoyl-Arg-p-nitroanilide and succinyl-Ala-Ala-Pro-Phe-p-nitroanilide. Amino-terminal sequencing of the three most abundant fractions gave two sequences, which were homologous to other Dipteran trypsins and chymotrypsins. Analysis of products generated by PCR of cDNA from whole larvae using specific primers based on the amino-terminal sequences and generic serine protease primers identified 22 different sequences, while phylogenetic analysis of the deduced amino acid sequences differentiated two trypsin-like and four chymotrypsin-like families. Phylogenetic comparisons with Dipteran and mammalian serine protease sequences showed that all the Chrysomya bezziana sequences clustered with Dipteran sequences. The Chrysomya bezziana chymotrypsin-like sequences segregated within a Dipteran cluster of chymotrypsin sequences, but were well dispersed amongst these sequences. The largest Chrysomya bezziana serine protease family, the trypB family, clustered tightly as a group, and was closely related to a Lucilia cuprina trypsin but distinct from Drosophila melanogaster alpha and beta trypsins. The trypB family contains ten highly homologous sequences and probably represents an example of concerted evolution of a trypsin gene in Chrysomya bezziana.

20.
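Genotyping of HBV gene sequences in the DDBJ database by phylogenetic tree analysis   Cited by: 1 (self-citations: 0; by others: 1)
To make full use of the HBV sequence information in nucleic acid databases and to explore the genotyping of HBV gene sequences in the DDBJ nucleic acid database, Clustal X (1.8) software was used to compare sequence differences in the pre-S region of HBV gene sequences and to generate a phylogenetic tree. Systematic analysis of 1,471 downloaded HBV gene sequences yielded 228 HBV sequences with a complete pre-S/S region, of which 66 had genotypes already confirmed by various methods. A phylogenetic tree based on the pre-S region of the 228 HBV sequences was constructed with the software. The typing of the 66 sequences of known genotype on the phylogenetic tree agreed completely with their original genotypes. Of the 228 HBV sequences, 207 belonged to the seven genotypes A, B, C, D, E, F and G, but the other 21 could not be assigned to any of these seven genotypes and themselves fell into two mutually independent groups, provisionally designated untyped I and untyped II. Comparison of the pre-S nucleotide sequences of untyped I, untyped II and the other seven genotypes showed that untyped I, untyped II and genotype D each have a 33-nucleotide deletion in the pre-S region, although the position and form of the deleted fragment differ among the three, whereas the other six genotypes have no large deletion in the pre-S region. The results indicate that genotyping by phylogenetic tree analysis based on the pre-S region is accurate and reliable and that, in addition to the seven genotypes already confirmed, two further new HBV genotypes may exist.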
