Similar literature
 Found 20 similar articles; search took 46 ms
1.
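Analysis of minisatellite repeat sequences in the genome of the Chinese shrimp Fenneropenaeus chinensis   Cited by: 4 (self-citations: 0, citations by others: 4)
Gao Huan, Kong Jie. Acta Zoologica Sinica, 2005, 51(1): 101-107
By sequencing random genomic DNA fragments of the Chinese shrimp, we obtained genomic DNA sequence totaling about 641,000 bases, in which we found 1,720 repeat sequences. Of these, 398 were minisatellites, accounting for 23.14% of all repeats. The repeat units of these minisatellites were 7-165 bases long and were concentrated in the 7-21 base range; units of 12 bases were the most common, with 58 repeats, or 14.57% of all minisatellite repeats. By copy number, repeats with two copies of the unit were the most numerous (137), followed by those with three copies (122), and the number of repeats declined as copy number increased. Some of the sequences are available in GenBank under accession numbers AY699072-AY699076. The 398 repeats are built from 398 different repeat units, so there are many types of minisatellite. We provisionally divided them into three classes, composed of two, three, or four kinds of base, and further divided each class into subclasses by ordering the bases in each repeat from most to least abundant. This classification shows that minisatellites in the Chinese shrimp genome are, overall, AT-rich repeats and exhibit a certain "hierarchy", which reveals their relationship to microsatellites: some minisatellite repeats may have originated from microsatellites.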
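The counts in this abstract (repeat unit length, copy number) presuppose some way of locating tandem repeats in raw sequence. The abstract does not say which software or criteria the authors used, so the following is only a minimal sketch for perfect repeats; the function name `find_tandem_repeats` and its default thresholds (unit length 7-165, at least 2 copies, taken from the ranges quoted above) are illustrative assumptions, and imperfect repeats are ignored.

```python
def find_tandem_repeats(seq, min_unit=7, max_unit=165, min_copies=2):
    """Return (start, unit, copies) for perfect tandem repeats in seq.

    Hypothetical helper: scans left to right and, at each position, keeps
    the unit length that covers the longest stretch of sequence.
    """
    hits = []
    pos = 0
    while pos < len(seq):
        best = None  # (unit, copies) covering the most bases at this position
        for unit_len in range(min_unit, max_unit + 1):
            unit = seq[pos:pos + unit_len]
            if len(unit) < unit_len:
                break  # ran off the end of the sequence
            copies = 1
            while seq.startswith(unit, pos + copies * unit_len):
                copies += 1
            if copies >= min_copies:
                if best is None or copies * unit_len > best[1] * len(best[0]):
                    best = (unit, copies)
        if best:
            hits.append((pos, best[0], best[1]))
            pos += len(best[0]) * best[1]  # skip past the repeat just reported
        else:
            pos += 1
    return hits


example = "TTT" + "ACGTACG" * 3 + "GG"
print(find_tandem_repeats(example))
```

On the toy sequence, the 7-base unit `ACGTACG` is reported once with three copies. Tallying `copies` and `len(unit)` over all hits would give distributions of the kind summarized in the abstract.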

2.
Inverted repeats have been found to occur in both prokaryotic and eukaryotic genomes. Usually they are short and some have important functions in various biological processes. However, long inverted repeats are rare and can cause genome instability. Analyses of the C. elegans genome identified long, nearly-perfect inverted repeat sequences involving both divergently and convergently oriented homologous gene pairs and complete intergenic sequences. Comparisons with the orthologous regions from the genomes of C. briggsae and C. remanei show that the inverted repeat structures are often far more conserved than the sequences. This observation implies that there is an active mechanism for maintaining the inverted repeat nature of the sequences.
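An inverted repeat is a segment followed, possibly after a spacer, by its own reverse complement. As a minimal sketch of that definition (not the method used in the study, which deals with long, nearly-perfect repeats), the code below finds perfect inverted repeats with fixed-length arms; `find_inverted_repeats`, `arm_len` and `max_gap` are hypothetical names and parameters chosen for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_inverted_repeats(seq, arm_len=6, max_gap=20):
    """Return (left_start, right_start) pairs of perfect inverted repeats.

    The right arm must equal the reverse complement of the left arm and
    start at most max_gap bases after the left arm ends.
    """
    hits = []
    for i in range(len(seq) - 2 * arm_len + 1):
        target = reverse_complement(seq[i:i + arm_len])
        first = i + arm_len
        last = min(first + max_gap, len(seq) - arm_len)
        for j in range(first, last + 1):
            if seq[j:j + arm_len] == target:
                hits.append((i, j))
    return hits


example = "CC" + "GATTAC" + "TTTT" + "GTAATC" + "CC"
print(find_inverted_repeats(example))
```

In the toy sequence the arm `GATTAC` at position 2 pairs with `GTAATC` at position 12, separated by a 4-base spacer.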

3.
We study the length distribution functions for the 16 possible distinct dimeric tandem repeats in DNA sequences of diverse taxonomic partitions of GenBank (known human and mouse genomes, and complete genomes of Caenorhabditis elegans and yeast). For coding DNA, we find that all 16 distribution functions are exponential. For non-coding DNA, the distribution functions for most of the dimeric repeats have surprisingly long tails, that fit a power-law function. We hypothesize that: (i) the exponential distributions of dimeric repeats in protein coding sequences indicate strong evolutionary pressure against tandem repeat expansion in coding DNA sequences; and (ii) long tails in the distributions of dimers in non-coding DNA may be a result of various mutational mechanisms. These long, non-exponential tails in the distribution of dimeric repeats in non-coding DNA are hypothesized to be due to the higher tolerance of non-coding DNA to mutations. By comparing genomes of various phylogenetic types of organisms, we find that the shapes of the distributions are not universal, but rather depend on the specific class of species and the type of dimer.
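The length distributions discussed here can be tabulated directly from sequence. The sketch below is one simple way to do so under stated assumptions: it counts maximal, non-overlapping runs of each of the 16 dimers with a regular expression, and the function name `dimer_repeat_lengths` is hypothetical. It does not reproduce the authors' exact counting conventions, which the abstract does not specify.

```python
import re
from collections import Counter
from itertools import product


def dimer_repeat_lengths(seq):
    """Map each of the 16 dimers to a Counter {copy_number: occurrences}."""
    dist = {}
    for a, b in product("ACGT", repeat=2):
        dimer = a + b
        # Each match is one maximal run of back-to-back copies of the dimer.
        runs = re.findall(f"(?:{dimer})+", seq)
        dist[dimer] = Counter(len(run) // 2 for run in runs)
    return dist


dist = dimer_repeat_lengths("ACACACGGTTACAC")
print(dist["AC"])
```

For the toy input, the `AC` dimer occurs once as a run of three copies and once as a run of two. Plotting occurrences against copy number on log-linear and log-log axes is the usual way to distinguish an exponential from a power-law tail.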

4.
The nucleotide sequence of Korean ginseng (Panax schinseng Nees) chloroplast genome has been completed (AY582139). The circular double-stranded DNA, which consists of 156,318 bp, contains a pair of inverted repeat regions (IRa and IRb) with 26,071 bp each, which are separated by large and small single copy regions of 86,106 bp and 18,070 bp, respectively. The inverted repeat region is further extended into a large single copy region which includes the 5' parts of the rps19 gene. Four short inversions associated with short palindromic sequences that form stem-loop structures were also observed in the chloroplast genome of P. schinseng compared to that of Nicotiana tabacum. The genome content and the relative positions of 114 genes (75 peptide-encoding genes, 30 tRNA genes, 4 rRNA genes, and 5 conserved open reading frames [ycfs]), however, are identical with the chloroplast DNA of N. tabacum. Sixteen genes contain one intron while two genes have two introns. Of these introns, only one (trnL-UAA) belongs to the self-splicing group I; all remaining introns have the characteristics of six domains belonging to group II. Eighteen simple sequence repeats have been identified from the chloroplast genome of Korean ginseng. Several of these SSR loci show infra-specific variations. A detailed comparison of 17 known completed chloroplast genomes from the vascular plants allowed the identification of evolutionary modes of coding segments and intron sequences, as well as the evaluation of the phylogenetic utilities of chloroplast genes. Furthermore, through the detailed comparisons of several chloroplast genomes, evolutionary hotspots predominated by the inversion end points, indel mutation events, and high frequencies of base substitutions were identified. Large-sized indels were often associated with direct repeats at the end of the sequences facilitating intra-molecular recombination.

5.
This work describes the organization, at the nucleotide sequence level, of genes flanking the junctions of the large single copy regions and the inverted repeats of Spinacia oleracea (spinach) and Nicotiana debneyi chloroplast DNAs. In both genomes, trnH1, the gene for tRNA-His(GUG) is located at the extremity of the large single copy region 3' to psbA, the gene for the 35 kd Photosystem 2 protein. Both psbA and trnH1 are transcribed towards the inverted repeat. In spinach, the first 48 codons of rps19, the gene for the chloroplast ribosomal protein S19, lie in the inverted repeat and the last 44 codons lie in the large single copy region at the end opposite to that carrying trnH1. The gene for a protein homologous to the E. coli ribosomal protein L2, rpl2, is in the inverted repeat immediately 5' to rps19 and, like rps19, is transcribed towards the large single copy region. In N. debneyi, but not in spinach, rpl2 is interrupted by a 666 bp insertion. The gene for tRNA-Ile(CAT), trnI1, is located in the inverted repeats of spinach and N. debneyi, 5' to rpl2 and is transcribed in the same direction as rpl2.

6.
Cheung AK. Journal of Virology, 2004, 78(17): 9016-9029
Palindromic sequences (inverted repeats) flanking the origin of DNA replication with the potential of forming single-stranded stem-loop cruciform structures have been reported to be essential for replication of the circular genomes of many prokaryotic and eukaryotic systems. In this study, mutant genomes of porcine circovirus with deletions in the origin-flanking palindrome and incapable of forming any cruciform structures invariably yielded progeny viruses containing longer and more stable palindromes. These results suggest that origin-flanking palindromes are essential for termination but not for initiation of DNA replication. Detection of template strand switching in the middle of an inverted repeat strand among the progeny viruses demonstrated that both the minus genome and a corresponding palindromic strand served as templates simultaneously during DNA biosynthesis and supports the recently proposed rolling-circle "melting-pot" replication model. The genome configuration presented by this model, a four-stranded tertiary structure, provides insights into the mechanisms of DNA replication, inverted repeat correction (or conversion), and illegitimate recombination of any circular DNA molecule with an origin-flanking palindrome.

7.
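Species differences and biological functions of tandem repeat sequences   Cited by: 13 (self-citations: 0, citations by others: 13)
Gao Huan, Kong Jie. Zoological Research, 2005, 26(5): 555-564
Tandem repeats are repeat sequences formed by a core repeat unit of about 1-200 bases repeated many times head to tail. They are widespread in the genomes of eukaryotes and some prokaryotes and show specificity in species, base composition, and other respects. At the whole-genome level, the predominant repeat type differs among species. Even within a single repeat type, the different repeat-copy classes (such as AT and AC) vary greatly in their occurrence in the genome. These repeat types and copy classes also show species and base-composition differences among the chromosomes of a single species and between the coding and non-coding regions of genes. These differences reflect the complexity of the origin and evolution of repeat sequences, which may involve multiple mechanisms and factors and are closely related to biological function. In addition, repeat-analysis software and statistical criteria still have unresolved issues concerning algorithms, repeat length, and repeat perfection that need further study. The evolutionary relationships among tandem repeats themselves, their evolutionary status at the whole-genome level, their biological functions in the genome, and the construction and application of repeat-sequence databases will be the main topics of future research.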

8.
Direct or inverse repeated sequences are important functional features of prokaryotic and eukaryotic genomes. Considering the unique mechanism, involving single-stranded genomic intermediates, by which adenovirus (Ad) replicates its genome, we investigated whether repetitive homologous sequences inserted into E1-deleted adenoviral vectors would affect replication of viral DNA. In these studies we found that inverted repeats (IRs) inserted into the E1 region could mediate predictable genomic rearrangements, resulting in vector genomes devoid of all viral genes. These genomes (termed DeltaAd.IR) contained only the transgene cassette flanked on both sides by precisely duplicated IRs, Ad packaging signals, and Ad inverted terminal repeat sequences. Generation of DeltaAd.IR genomes could also be achieved by coinfecting two viruses, each providing one inverse homology element. The formation of DeltaAd.IR genomes required Ad DNA replication and appeared to involve recombination between the homologous inverted sequences. The formation of DeltaAd.IR genomes did not depend on the sequence within or adjacent to the inverted repeat elements. The small DeltaAd.IR vector genomes were efficiently packaged into functional Ad particles. All functions for DeltaAd.IR replication and packaging were provided by the full-length genome amplified in the same cell. DeltaAd.IR vectors were produced at a yield of approximately 10^4 particles per cell, which could be separated from virions with full-length genomes based on their lighter buoyant density. DeltaAd.IR vectors infected cultured cells with the same efficiency as first-generation vectors; however, transgene expression was only transient due to the instability of deleted genomes within transduced cells. The finding that IRs present within Ad vector genomes can mediate precise genetic rearrangements has important implications for the development of new vectors for gene therapy approaches.

9.
Summary: Our recent physical mapping of chloroplast DNA (cpDNA) from Chlamydomonas moewusii, a unicellular green alga which is interfertile with Chlamydomonas eugametos, has revealed a two-fold size difference between the inverted repeat sequences of these algae. With a size of 42 kbp, the inverted repeat of C. moewusii is the largest yet identified in any chloroplast genome. Here we have compared the arrangement of conserved sequences within the two algal inverted repeats by hybridizing cloned restriction fragments representing over 90% of these repeats to Southern blots of cpDNA digests from the two algae. We found that the size difference between the two algal inverted repeats is due to the presence of an extra DNA segment of 21 kilobase pairs (kbp) in C. moewusii. Except for this sequence, the C. moewusii inverted repeat is highly homologous to the entire C. eugametos repeat and the arrangement of conserved sequences in the two repeats is identical. Southern hybridizations with specific gene probes revealed that the conserved sequences include the rDNA region and the genes coding for the large subunit of ribulose 1,5 bisphosphate carboxylase-oxygenase (rbcL) and for the 32 kilodalton thylakoid membrane protein (psbA). With respect to the conserved sequences, the extra 21 kbp DNA segment of C. moewusii lies in the region of psbA, most probably slightly downstream from this gene.

10.
The chloroplast genome sequence of Coffea arabica L., the first sequenced member of the fourth largest family of angiosperms, Rubiaceae, is reported. The genome is 155 189 bp in length, including a pair of inverted repeats of 25 943 bp. Of the 130 genes present, 112 are distinct and 18 are duplicated in the inverted repeat. The coding region comprises 79 protein genes, 29 transfer RNA genes, four ribosomal RNA genes and 18 genes containing introns (three with three exons). Repeat analysis revealed five direct and three inverted repeats of 30 bp or longer with a sequence identity of 90% or more. Comparisons of the coffee chloroplast genome with sequenced genomes of the closely related family Solanaceae indicated that coffee has a portion of rps19 duplicated in the inverted repeat and an intact copy of infA. Furthermore, whole-genome comparisons identified large indels (> 500 bp) in several intergenic spacer regions and introns in the Solanaceae, including trnE(UUC)–trnT(GGU) spacer, ycf4–cemA spacer, trnI(GAU) intron and rrn5–trnR(ACG) spacer. Phylogenetic analyses based on the DNA sequences of 61 protein-coding genes for 35 taxa, performed using both maximum parsimony and maximum likelihood methods, strongly supported the monophyly of several major clades of angiosperms, including monocots, eudicots, rosids, asterids, eurosids II, and euasterids I and II. Coffea (Rubiaceae, Gentianales) is only the second order sampled from the euasterid I clade. The availability of the complete chloroplast genome of coffee provides regulatory and intergenic spacer sequences for utilization in chloroplast genetic engineering to improve this important crop.

11.
12.
Protein sequences are normally the most conserved elements of genomes owing to purifying selection to maintain their functions. We document an extraordinary amount of within-species protein sequence variation in the model eukaryote Dictyostelium discoideum stemming from triplet DNA repeats coding for long strings of single amino acids. D. discoideum has a very large number of such strings, many of which are polyglutamine repeats, the same sequence that causes various neurological disorders in humans, like Huntington’s disease. We show here that D. discoideum coding repeat loci are highly variable among individuals, making D. discoideum a candidate for the most variable proteome. The coding repeat loci are not significantly less variable than similar non-coding triplet repeats. This pattern is consistent with these amino-acid repeats being largely non-functional sequences evolving primarily by mutation and drift.

13.
Behura SK, Severson DW. Gene, 2012, 504(2): 226-232
We present a detailed genome-scale comparative analysis of simple sequence repeats within protein coding regions among 25 insect genomes. The repetitive sequences in the coding regions primarily represented single codon repeats and codon pair repeats. The CAG triplet is highly repetitive in the coding regions of insect genomes. It is frequently paired with the synonymous codon CAA to code for polyglutamine repeats. The codon pairs that are least repetitive code for polyalanine repeats. The frequency of hexanucleotide and dinucleotide motifs of codon pair repeats is significantly (p<0.001) different in the Drosophila species compared to the non-Drosophila species. However, the frequency of synonymous and non-synonymous codon pair repeats varies in a correlated manner (r² = 0.79) among all the species. Results further show that perfect and imperfect repeats have significant association with the trinucleotide and hexanucleotide coding repeats in most of these insects. However, only select species show significant association between the numbers of perfect/imperfect hexamers and repeat coding for single amino acid/amino acid pair runs. Our data further suggest that genes containing simple sequence coding repeats may be under negative selection as they tend to be poorly conserved across species. The sequences of coding repeats of orthologous genes vary according to the known phylogeny among the species. In conclusion, the study shows that simple sequence coding repeats are important features of genome diversity among insects.
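The pairing of CAG with its synonymous codon CAA in polyglutamine repeats can be illustrated with a short in-frame scan. This is a minimal sketch assuming the input is a coding sequence already in reading frame; the function name `polyglutamine_runs` and the `min_len` threshold are hypothetical and are not taken from the study.

```python
def polyglutamine_runs(cds, min_len=3):
    """Return (codon_index, run_length) for in-frame runs of CAG/CAA codons."""
    # Split into complete codons, dropping any trailing partial codon.
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    runs = []
    start = None
    # A sentinel at the end closes a run that reaches the last codon.
    for idx, codon in enumerate(codons + ["END"]):
        if codon in ("CAG", "CAA"):
            if start is None:
                start = idx
        elif start is not None:
            if idx - start >= min_len:
                runs.append((start, idx - start))
            start = None
    return runs


cds = "ATG" + "CAGCAACAGCAG" + "GCT" + "CAGCAA" + "TAA"
print(polyglutamine_runs(cds))
```

In the toy sequence, the run of four glutamine codons starting at codon 1 is reported, while the later two-codon run falls below the threshold.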

14.
15.
Mutational analysis of the inverted repeats of Tn3   Cited by: 1 (self-citations: 0, citations by others: 1)
The transposase protein and the terminal inverted repeat sequences of the prokaryotic transposon Tn3 are essential for transposition. In order to determine the sequences within the inverted repeat necessary for transposition and interaction with transposase, we have constructed a series of mini-Tn3s in which specific mutations have been introduced into the inverted repeats. The effects of these mutations on transposition have been assayed in vivo using a mating-out transposition assay. Several single base-pair mutations within the transposase binding site reduce transposition frequency. Mutations that affect transposition show a greater effect when present in both inverted repeats than when present in only one inverted repeat.

16.
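The genome of Rhizobium astragali contains a repeated DNA sequence (RSRa). It is repeated 4-5 times in the genome of Ra159, and one copy lies upstream of the nifH gene. Using a 1.25 kb PvuI fragment as a probe, DNA fragments homologous to RSRa were also detected in other R. astragali strains and in the pea rhizobium strain RI PRE. Sequencing shows that RSRa is structurally similar to an IS element and has some features of prokaryotic insertion sequences. RSRa is 1,468 bp long, has inverted repeats at its two ends, and contains one large open reading frame (ORF). The protein deduced from the ORF shows high homology to the putative transposase of the Escherichia coli insertion sequence IS903.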

17.
Eucaryotic transposable genetic elements with inverted terminal repeats   Cited by: 22 (self-citations: 0, citations by others: 22)
S Potter, M Truett, M Phillips, A Maher. Cell, 1980, 20(3): 639-647
DNA carrying inverted repeats was tested for transposition within the Drosophila genome. Five BamHI segments containing related inverted repeats were isolated from D. melanogaster and analyzed by electron microscopy and restriction mapping. Southern blot experiments using single-copy flanking sequences as probes allowed the study of DNA arrangements at specific sites in the genomes of five closely related strains. We found that in some genomes the sequences with inverted repeats were present at a particular site, whereas in other genomes they were absent from this site. These results indicated that three of the sequences are transposable genetic elements. In one case we have purified the two corresponding DNA segments, with and without the sequence containing inverted repeats, thereby confirming the mobility of this sequence. These DNA elements were found to be distinct in two ways from copia and others previously described: first, they contain inverted terminal repeats, and second, they have a more heterogeneous construction.

18.
19.
Clustered regularly interspaced short palindromic repeats (CRISPRs) are a family of DNA direct repeats found in many prokaryotic genomes. Repeats of 21-37 bp typically show weak dyad symmetry and are separated by regularly sized, nonrepetitive spacer sequences. Four CRISPR-associated (Cas) protein families, designated Cas1 to Cas4, are strictly associated with CRISPR elements and always occur near a repeat cluster. Some spacers originate from mobile genetic elements and are thought to confer "immunity" against the elements that harbor these sequences. In the present study, we have systematically investigated uncharacterized proteins encoded in the vicinity of these CRISPRs and found many additional protein families that are strictly associated with CRISPR loci across multiple prokaryotic species. Multiple sequence alignments and hidden Markov models have been built for 45 Cas protein families. These models identify family members with high sensitivity and selectivity and classify key regulators of development, DevR and DevS, in Myxococcus xanthus as Cas proteins. These identifications show that CRISPR/cas gene regions can be quite large, with up to 20 different, tandem-arranged cas genes next to a repeat cluster or filling the region between two repeat clusters. Distinctive subsets of the collection of Cas proteins recur in phylogenetically distant species and correlate with characteristic repeat periodicity. The analyses presented here support initial proposals of mobility of these units, along with the likelihood that loci of different subtypes interact with one another as well as with host cell defensive, replicative, and regulatory systems. It is evident from this analysis that CRISPR/cas loci are larger, more complex, and more heterogeneous than previously appreciated.

20.
Two novel families of miniature inverted repeat transposable elements (MITEs), Vege and Mar, are described from Drosophila willistoni. Based on their structures, both element families are hypothesized to belong to the hAT superfamily of transposable elements. Both elements have perfect, inverted terminal repeats and 8-bp target site duplications and were found to have inserted within fixed copies of nonautonomous P elements. Vege is present in all studied D. willistoni populations and appears to have a relatively low copy number. Mar was identified in only a single D. willistoni population, and its copy number is presently unknown. Although MITEs occupy relatively large proportions of the genomes of a broad range of organisms, this may be their first unambiguous identification in any species of the genus Drosophila.
