Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
Successful integration of viral genome into a host chromosome depends on interaction between viral integrase and its recognition sequences. We have used a reconstituted concerted human immunodeficiency virus, type 1 (HIV-1), integration system to analyze the role of integrase (IN) recognition sequences in formation of the IN-viral DNA complex capable of concerted integration. HIV-1 integrase was presented with substrates that contained all 4 bases at 8 mismatched positions that define the inverted repeat relationship between U3 and U5 long terminal repeats (LTR) termini and at positions 17-19, which are conserved in the termini. Evidence presented indicates that positions 17-20 of the IN recognition sequences are needed for a concerted DNA integration mechanism. All 4 bases were found at each randomized position in sequenced concerted DNA integrants, although in some instances there were preferences for specific bases. These results indicate that integrase tolerates a significant amount of plasticity as to what constitutes an IN recognition sequence. By having several positions randomized, the concerted integrants were examined for statistically significant relationships between selections of bases at different positions. The results of this analysis show not only relationships between different positions within the same LTR end but also between different positions belonging to opposite DNA termini.

2.
The nucleotide sequence of the 3' long terminal repeat and adjacent viral and host sequences was determined for a bovine leukemia provirus cloned from a bovine tumor. The long terminal repeat was found to comprise 535 nucleotides and to harbor at both ends an imperfect inverted repeat of 7 bases. Promoter-like sequences (Hogness box and CAT box), an mRNA capping site, and a core enhancer-related sequence were tentatively located. No kinship was detected between this bovine leukemia proviral fragment and other retroviral long terminal repeats, including that of human T-cell leukemia virus.

3.
D Jenne, K K Stanley. Biochemistry 1987, 26(21): 6735-6742
The S-protein/vitronectin gene was isolated from a human genomic DNA library, and its sequence of about 5.3 kilobases including the adjacent 5' and 3' flanking regions was established. Alignment of the genomic DNA nucleotide sequence and the cDNA sequence indicated that the gene consisted of eight exons and seven introns. The intron positions in the S-protein gene and their phase type were compared to those in the hemopexin gene, which shares amino acid sequence homologies with transin and the S-protein. Three introns have been found at equivalent positions; two other introns are very close to these positions and are interpreted as cases of intron sliding. Introns 3-7 occur at a conserved glycine residue within repeating peptide segments, whereas introns 1 and 2 are at the boundaries of the Somatomedin B domain of S-protein. The analysis of the exon structure in relation to repeating peptide motifs within the S-protein strongly suggests that it contains only seven repeats, one less than the hemopexin molecule. A repeat pattern very similar to that in hemopexin is shown to be present also in two other related proteins, transin and interstitial collagenase. An evolutionary model for the generation of the repeat pattern in the S-protein and the other members of this novel "pexin" gene family is proposed, and the sequence modifications for some of the repeats during divergent evolution are discussed in relation to known unique functional properties of hemopexin and S-protein.

4.
Mack AM, Crawford NM. The Plant Cell 2001, 13(10): 2319-2332
The in vitro DNA binding activity of the Arabidopsis Tag1 transposase (TAG1) was characterized to determine the mechanism of DNA recognition. In addition to terminal inverted repeats, the Tag1 element contains four different subterminal repeats that flank a transcribed region encoding a 729-amino acid protein. A single site-specific DNA binding domain is located near the N terminus of TAG1, between residues 21 and 133. This domain binds specifically to the AAACCC and TGACCC subterminal repeats, found near the 5' and 3' ends of the element, respectively. The ACCC sequence within these repeats is critical for recognition because mutations at positions 3, 5, and 6 abolished binding, yet the first two bases also are important because substitutions at these positions decreased binding by up to 90%. Weak interaction also occurs with the terminal inverted repeats, but no binding was observed to the other two 3' subterminal repeat regions. Sequence analysis of the TAG1 DNA binding domain revealed a C(2)HC zinc finger motif. Tests for metal dependence showed that DNA binding activity was inhibited by divalent metal chelators and greatly enhanced by zinc. Furthermore, mutation of each cysteine residue predicted to be a metal ligand in the C(2)HC motif abolished DNA binding. Together, these data show that the DNA binding domain of TAG1 specifically binds to distinct subterminal repeats and contains a zinc finger.

5.
Mononucleotide repeats (MNRs) have been systematically investigated in the genomes of eukaryotic and prokaryotic organisms. However, detailed information on the distribution of MNRs in viral genomes is limited. In this study, we examined the distributions of MNRs in 256 fully sequenced virus genomes. The distributions showed extensive variation across viral genomes and were significantly influenced by both genome size and CG content. Furthermore, the ratio of the observed to the expected number of MNRs (O/E ratio) appears to be influenced by both the host range and genome type of a particular virus. Additionally, the densities and frequencies of MNRs in genic regions are lower than in non-coding regions, suggesting that selective pressure acts on viral genomes. We also discuss the potential functional roles that these MNR loci could play in virus genomes. To our knowledge, this is the first analysis focusing on MNRs in viruses, and our study could have potential implications for a deeper understanding of virus genome stability and the co-evolution that occurs between a virus and its host.

6.
Microsatellites are simple sequence repeats (SSRs) showing complex patterns of length, motif sizes, motif sequences, and repeat perfection. We studied the structure of the dinucleotide SSR population at the genome level by analyzing assembled DNA sequence across species. Three dinucleotide populations were distinguished when SSR genome frequency was analyzed as a function of repeat length and repeat perfection. A population of low-perfection SSRs was identified, which is constituted by short repeats and represents the vast majority of genomic dinucleotide SSRs across eukaryotic genomes. In turn, the highly perfect repeats are 30 to 50 times less frequent and, in addition to short repeats, also contain a long repeat population that is uniquely represented in vertebrate species. Distinctive features of this population include the modal peak in the frequency distribution of repeat length and the strong preferential usage of the repeat motifs AC and AG. These results raise the hypothesis that the ability to carry a distinct population of long, highly perfect dinucleotide repeats in the genome is a late acquisition in chordate evolution. Our analysis also suggests that different dinucleotide repeat populations have different dynamics and are likely to be underlain by different molecular mechanisms of generation and maintenance in the genome. Thus, these observations imply that caution should be taken in extrapolating results from studies on SSR mutability and on SSR phylogenetic comparisons that do not take into account the stratification of dinucleotide populations in the eukaryotic genome.

7.

Background

Birds have smaller average genome sizes than other tetrapod classes, and it has been proposed that a relatively low frequency of repeating DNA is one factor in reduction of avian genome sizes.

Results

DNA repeat arrays in the sequenced portion of the chicken (Gallus gallus) autosomes were quantified and compared with those in human autosomes. In the chicken 10.3% of the genome was occupied by DNA repeats, in contrast to 44.9% in human. In the chicken, the percentage of a chromosome occupied by repeats was positively correlated with chromosome length, but even the largest chicken chromosomes had repeat densities much lower than those in human, indicating that avoidance of repeats in the chicken is not confined to minichromosomes. When 294 simple sequence repeat types shared between chicken and human genomes were compared, mean repeat array length and maximum repeat array length were significantly lower in the chicken than in human.

Conclusions

The fact that the chicken simple sequence repeat arrays were consistently smaller than arrays of the same type in human is evidence that the reduction in repeat array length in the chicken has involved numerous independent evolutionary events. This implies that reduction of DNA repeats in birds is the result of adaptive evolution. Reduction of DNA repeats on minichromosomes may be an adaptation to permit chiasma formation and alignment of small chromosomes. However, the fact that repeat array lengths are consistently reduced on the largest chicken chromosomes supports the hypothesis that other selective factors are at work, presumably related to the reduction of cell size and consequent advantages for the energetic demands of flight.

8.
A sequence search of swine expressed sequence tag (EST) data in GenBank identified over 100 sequence files which contained a microsatellite repeat or simple sequence repeat (SSR). Most of these repeat motifs were dinucleotide (CA/GT) repeats; however, a number of tri-, tetra-, penta- and hexa-nucleotide repeats were also detected. An initial assessment of six dinucleotide and 14 higher-order repeat markers indicated that only dinucleotide markers yielded a sufficient number of informative markers (100% vs. 14% for dinucleotide and higher-order repeats, respectively). Primers were designed for an additional 50 di- and one tri-nucleotide SSRs. Overall, 42 markers were polymorphic in the US Meat Animal Research Center (MARC) reference population, 17 markers were uninformative and 12 primer pairs failed to satisfactorily amplify genomic DNA. A comparison of dinucleotide repeat markers vs. markers with repeat motifs of three to six bases demonstrated that 72% of dinucleotide markers were informative relative to only 7% of other repeat motifs. The difference was the result of a much higher percentage of monomorphic markers in the three to six base repeat motif markers than in the dinucleotide markers (64% vs. 14%). Either higher-order repeat motifs are less polymorphic in the porcine genome or our selection criterion for repeat length (more than 17 contiguous bases) was too low. The mapped microsatellite markers add to the porcine genetic map and provide valuable links between the porcine and human genomes.

9.
In order to study the mechanisms for the generation of length diversity within the 5' flanking region of the human insulin gene, we have isolated and sequenced a previously uncharacterized allele. This allele, of a size intermediate between the three already described in the literature, encompasses 1,156 base pairs (bp) and contains 81 reiterated tandem oligonucleotides of 14-15 bp each. Population analysis on 298 independently sampled individuals by Southern blotting of genomic DNA demonstrates that the polymorphic portion of the insulin 5' flanking region varies from 400 to more than 8,000 nucleotides, being encoded by from 30 to over 540 oligomeric repeats. Length variability 5' to the insulin gene is a result primarily of unequal crossing over, which generates an expansion or contraction in the number of tandem repeat units per chromosome. A similar mechanism probably accounts for nondispersed reiterated sequences at other loci in the human genome.

10.
Novel functional role of CA repeats and hnRNP L in RNA stability
CA dinucleotide repeat sequences are very common in the human genome. We have recently demonstrated that the polymorphic CA repeats in intron 13 of the human endothelial nitric oxide synthase (eNOS) gene function as an unusual, length-dependent splicing enhancer. The CA repeat enhancer requires for its activity specific binding of hnRNP L. Here we show that in the absence of bound hnRNP L, the pre-mRNA is cleaved directly upstream of the CA repeats. The addition of recombinant hnRNP L restores RNA stability. CA repeats are both necessary and sufficient for this specific cleavage in the 5' adjacent RNA sequence. We conclude that, in addition to its role as a splicing activator, hnRNP L can act in vitro as a sequence-specific RNA protection factor. Based on the wide abundance of CA repetitive sequences in the human genome, this may represent a novel, generally important role of this abundant hnRNP protein.

11.
Interactions between the termini of adeno-associated virus DNA

12.
Expansion of GAA repeats in the intron of the frataxin gene is involved in the autosomal recessive Friedreich's ataxia (FRDA). The GAA repeats arise from a stretch of adenine residues of an Alu element. These repeats range in size from 7 to 38 in the normal population, and expand to thousands in affected individuals. The mechanism of origin of GAA repeats, their polymorphism and stability are not well understood. In this study, we have carried out an extensive analysis of GAA repeats at several loci in humans. This analysis indicates the association of a majority of GAA repeats with the 3' end of an "A" stretch present in the Alu repeats. Further, the prevalence of GAA repeats correlates with the evolutionary age of Alu subfamilies as well as with their relative frequency in the genome. Our study on GAA repeat polymorphism at some loci in the normal population reveals that the length of the GAA repeats is determined by the relative length of the flanking A stretch. Based on these observations, a possible mechanism for origin of GAA repeats and modulatory effects of flanking sequences on repeat instability mediated by DNA triplex is proposed.

13.
The human alpha-fetoprotein gene spans 19,489 base pairs from the putative "Cap" site to the polyadenylation site. It is composed of 15 exons separated by 14 introns, which are symmetrically placed within the three domains of alpha-fetoprotein. In the 5' region, a putative TATAAA box is at position -21, and a variant sequence, CCAAC, of the common CAT box is at -65. Enhancer core sequences GTGGTTTAAAG are found in introns 3 and 4, and several copies of glucocorticoid response sequences AGATACAGTA are found on the template strand of the gene. There are six polymorphic sites within 4690 base pairs of contiguous DNA derived from two allelic alpha-fetoprotein genes. This amounts to a measured polymorphic frequency of 0.13%, or 6.4 × 10^(-4)/site, which is about 5-10 times lower than values estimated from studies on polymorphic restriction sites in other regions of the human genome. There are four types of repetitive sequence elements in the introns and flanking regions of the human alpha-fetoprotein gene. At least one of these is apparently a novel structure (designated Xba) and is found as a pair of direct repeats, with one copy in intron 7 and the other in intron 8. It is conceivable that within the last 2 million years the copy in intron 8 gave rise to the repeat in intron 7. Their present location on both sides of exon 8 gives these sequences a potential for disrupting the functional integrity of the gene in the event of an unequal crossover between them. There are three Alu elements, one of which is in intron 4; the others are located in the 3' flanking region. A solitary Kpn repeat is found in intron 3. The Xba and Kpn repeats were only detected by complete sequencing of the introns. Neither X, Xba, nor Kpn elements are present in the related human albumin gene, whereas Alu's are present in different positions. From phylogenetic evidence, it appears that Alu elements were inserted into the alpha-fetoprotein gene at some time postdating the mammalian radiation 85 million years ago.

14.
Huntington disease (HD) is an autosomal dominant degenerative disorder caused by an expanded and unstable trinucleotide repeat (CAG)n in a gene (IT-15) on chromosome 4. HD exhibits genetic anticipation, that is, earlier onset in successive generations within a pedigree. From a population-based clinical sample, we ascertained parent-offspring pairs with expanded alleles, to examine the intergenerational behavior of the trinucleotide repeat and its relationship to anticipation. We find that the change in repeat length with paternal transmission is significantly correlated with the change in age at onset between the father and offspring. When expanded triplet repeats of affected parents are separated by median repeat length, we find that the longer paternal and maternal repeats are both more unstable on transmission. However, unlike in paternal transmission, in which longer expanded repeats display greater net expansion than do shorter expanded repeats, in maternal transmission there is no mean change in repeat length for either longer or shorter expanded repeats. We also confirmed the inverse relationship between repeat length and age at onset, the higher frequency of juvenile-onset cases arising from paternal transmission, anticipation as a phenomenon of paternal transmission, and greater expansion of the trinucleotide repeat with paternal transmission. Stepwise multiple regression indicates that, in addition to repeat length of offspring, age at onset of affected parent and sex of affected parent contribute significantly to the variance in age at onset of the offspring. Thus, in addition to triplet repeat length, other factors, which could act as environmental factors, genetic factors, or both, contribute to age at onset. Our data establish that further expansion of paternal repeats within the affected range provides a biological basis of anticipation in HD.

15.
16.
17.
The current pace of the generation of sequence data requires the development of software tools that can rapidly provide full annotation of the data. We have developed a new method for rapid sequence comparison using the exact match algorithm without repeat masking. As a demonstration, we have identified all perfect simple tandem repeats (STR) within the draft sequence of the human genome. The STR elements (chromosome, position, length and repeat subunit) have been placed into a relational database. Repeat flanking sequence is also publicly accessible at http://grid.abcc.ncifcrf.gov. To illustrate the utility of this complete set of STR elements, we documented the increased density of potentially polymorphic markers throughout the genome. The new STR markers may be useful in disease association studies because so many STR elements manifest multiallelic polymorphism. Also, because triplet repeat expansions are important for human disease etiology, we identified trinucleotide repeats that exist within exons of known genes. This resulted in a list that includes all 14 genes known to undergo polynucleotide expansion, and 48 additional candidates. Several of these are non-polyglutamine triplet repeats. Other examinations of the STR database demonstrated repeats spanning splice junctions and identified SNPs within repeat elements.
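
As a rough illustration of what "perfect simple tandem repeats" recorded by position, length and repeat subunit might look like in practice, the sketch below scans a DNA string with a regular expression. This is not the exact-match method described in the abstract, whose details are not given here; the function name `find_perfect_strs` and the thresholds (`max_unit`, `min_copies`) are assumptions chosen only for the example.

```python
import re


def find_perfect_strs(seq, min_unit=1, max_unit=6, min_copies=3):
    """Return sorted (start, length, unit, copies) tuples for perfect tandem repeats.

    Hypothetical helper for illustration; thresholds are arbitrary assumptions.
    """
    seq = seq.upper()
    hits = []
    for unit_len in range(min_unit, max_unit + 1):
        # A unit of unit_len bases followed by at least (min_copies - 1) exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_copies - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            # Skip units that are themselves repeats of a shorter motif
            # (for example "AA" is already reported as a run of "A").
            if any(
                unit == unit[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            length = len(match.group(0))
            hits.append((match.start(), length, unit, length // unit_len))
    return sorted(hits)


if __name__ == "__main__":
    for start, length, unit, copies in find_perfect_strs("GGACACACACTTCAGCAGCAGGT"):
        print(start, length, unit, copies)
```

Each tuple mirrors the fields the abstract lists for its database (position, length, repeat subunit), with the chromosome omitted because the sketch works on a single string.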

18.
The complete nucleotide sequence of the human apolipoprotein AII gene together with 911 bases of 5' flanking sequence and 687 bases of 3' flanking sequence have been determined. The mRNA coding region is interrupted by three introns of 169, 293 and 395 bp. The intron-exon structure of the apo AII gene is similar to that of the apo AI, apo CIII and apo E genes: three introns separate 4 coding sequences specifying the 5' untranslated region, pre-peptide, a short N-terminal domain and a C-terminal domain composed of a variable number of lipid-binding amphipathic helices. Intron II carries a 33 bp dG-dT repetitive element adjacent to the 3' splice junction which has the potential to adopt the Z-DNA conformation. The 5' and 3' termini of the mRNA have been identified by primer extension and S1 nuclease mapping. A number of short direct repeats are found in the 5' flanking region and an inverted repeat occurs between the CAAT and TATA boxes. Downstream of the gene is an Alu family repeat containing a polymorphic MspI site, the deletion of which is associated with increased circulating levels of apo AII. Apo AII gene expression was demonstrated in adult human liver and HepG2 cells but not in human small intestine. Of ten Rhesus monkey tissues examined, apo AII mRNA was detected only in liver.

19.
A new family of repeats, the MB1 repeat family, present at a few hundred thousand copies per human genome, has been revealed in the human genome by computer analysis of noncanonical similarity of nucleic acid sequences. Members of this family have also been found in the genomes of mouse and rat; they have been identified as mirror-reflected copies (in purines and pyrimidines) of the B1 repeats in the mouse genome and the Alu repeats in the human genome. The MB1 repeats tend to remain most similar over a length of 70 bp. They are not flanked by short repeats and do not contain a poly(A) region at the 3' end, which distinguishes them from repeats of the SINE family. It has been assumed that members of the Alu repeat family and the MB1 repeat family can form the so-called H-form of DNA. The mirror-reflected repeat family could have been formed by replication of parallel DNA strands.

20.
To determine the frequency and clustering of a variety of simple di- and trinucleotide repeats, an Artiodactyl short interspersed element (SINE), an ovine satellite repeat, and a human Alu 1 repeat were used to screen a random selection of cosmids containing inserts of ovine genomic DNA. In total, 197 individual cosmids were digested with EcoRI and the fragments separated on 0.7% agarose gels. Southern blots of these gels were then sequentially probed with (AC)7, (CT)9, and (CAC)6 oligonucleotides, and the repeats described above. The frequency at which (AC)n, (CT)n, and (CAC)n repeats were found in the cosmids indicated that they occurred at average intervals of 65 kb, 367 kb, and 213 kb respectively within the ovine genome. The Artiodactyl SINE was the most common, occurring at an average interval of 20 kb. No human Alu 1 sequences were detected. There was a significant positive association between the (AC)n and the Artiodactyl SINE. This association is quite strong as there was significant clustering of the two repeats both within cosmids and also within the EcoRI fragments of the digested genomic fragments. With the exception of the sheep satellite sequence, which occurs in tandem arrays, none of the other repeats showed significant clustering within the 41-kb (average size) cosmid inserts. The first 25 ovine microsatellites we characterized had an average polymorphic information content (PIC) of 0.65. The different microsatellite types, containing either perfect, imperfect, or compound repeats, had similar average PICs of 0.64, 0.65, and 0.66 respectively. There was a weak regression relationship (adjusted R² = 21.9%) between the length of the longest uninterrupted dinucleotide repeat in the largest allele and the PIC of the microsatellite.
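
For readers unfamiliar with the polymorphic information content reported above, the following sketch computes PIC from allele frequencies using the standard Botstein et al. definition, PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2. The abstract does not state which formula the authors used, so this is an assumption; the function name `pic` and the example frequencies are illustrative only and are not data from the study.

```python
def pic(allele_counts):
    """Polymorphic information content from allele counts or frequencies.

    Hypothetical helper; inputs are normalised so counts and frequencies both work.
    """
    total = float(sum(allele_counts))
    if total <= 0:
        raise ValueError("allele_counts must contain at least one positive value")
    p = [count / total for count in allele_counts]
    # Probability that two randomly drawn alleles are identical.
    homozygosity = sum(x * x for x in p)
    # Correction for matings between identical heterozygotes, which are uninformative.
    cross_term = sum(
        2 * p[i] ** 2 * p[j] ** 2
        for i in range(len(p))
        for j in range(i + 1, len(p))
    )
    return 1.0 - homozygosity - cross_term


if __name__ == "__main__":
    # Illustrative frequencies only.
    print(round(pic([0.5, 0.5]), 3))
    print(round(pic([0.25, 0.25, 0.25, 0.25]), 3))
```

A marker with a single allele gives a PIC of 0, and the value rises as more alleles occur at similar frequencies, which is why it is used as a measure of marker informativeness.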
