Similar articles
 20 similar articles found
1.
We have developed a method of targeted genomic difference analysis (TGDA) for genome-wide detection of interspersed repeat integration site differences between closely related genomes. The method includes a whole-genome amplification of the flanks adjacent to target interspersed repetitive elements in both genomic DNAs under comparison, and subtractive hybridization (SH) of the selected amplicons. The potential of TGDA was demonstrated by the detection of differences in the integration sites of human endogenous retroviruses K (HERV-K) and related solitary long terminal repeats (LTRs) between the human and chimpanzee genomes. Of 55 randomly sequenced clones from a library enriched with human-specific integration (HSI) sites, 33 (60%) represented HSIs. All the human-specific (Hs) LTRs belong to two related evolutionarily young groups, suggesting simultaneous activity of two master genes in the hominid lineage. No deletion/insertion polymorphism was detected for the LTR HSIs in 25 unrelated caucasoid individuals. We also discuss possible research applications of TGDA.

2.
Structural instability has been frequently observed in natural plasmids and vectors used for protein expression or DNA vaccine development. However, there is a lack of information concerning hotspot mapping, namely, DNA repeats or sequences identical to the host genome. This led us to evaluate the abundance and distribution of direct, inverted, and tandem repeats with high recombination potential in 36 natural plasmids from ten bacterial genera, as well as in several widely used bacterial and mammalian expression vectors. In natural plasmids, we observed an overrepresentation of close direct repeats in comparison to inverted ones and a preferential location of repeats with high recombination potential in intergenic regions, suggesting a highly plastic and dynamic behavior. In plasmid vectors, we found a high density of repeats within eukaryotic promoters and non-coding sequences. As a result of this in silico analysis, we detected a spontaneous recombination between two 21-bp direct repeats present in the human cytomegalovirus early enhancer/promoter (huCMV EEP) of the pCIneo plasmid. This finding is of particular importance, as the huCMV EEP is one of the most frequently used regulatory elements in plasmid vectors. Because pDNA integration into host gDNA can have adverse consequences in terms of plasmid processing and host safety, we also mapped several regions with a high probability of mediating integration into the Escherichia coli or human genomes. Like repeated regions, some of these were located in non-coding regions of the plasmids, thus being preferential targets to be removed.

3.
Alu elements undergo amplification through retroposition and integration into new locations throughout primate genomes. Over 500,000 Alu elements reside in the human genome, making the identification of newly inserted Alu repeats the genomic equivalent of finding needles in a haystack. Here, we present two complementary methods for rapid detection of newly integrated Alu elements. In the first approach we employ computational biology to mine the human genomic DNA sequence databases in order to identify recently integrated Alu elements. The second method is based on an anchor-PCR technique which we term Allele-Specific Alu PCR (ASAP). In this approach, Alu elements are selectively amplified from anchored DNA, generating a display or 'fingerprint' of recently integrated Alu elements. Alu insertion polymorphisms are then detected by comparison of the DNA fingerprints generated from different samples. Here, we explore the utility of these methods by applying them to the identification of members of the smallest previously identified subfamily of Alu repeats in the human genome, termed Ya8. This subfamily of Alu repeats is composed of about 50 elements within the human genome. Approximately 50% of the Ya8 Alu family members have inserted in the human genome so recently that they are polymorphic, making them useful markers for the study of human evolution. This revised version was published online in July 2006 with corrections to the Cover Date.

4.
Members of the MB1 repeat family are found in the genomes of many mammals (cow, rabbit, opossum, horse, ...). The MB1 repeats from the cow and rabbit genomes are mirror-reflected relative to the SINE family repeats from the same genomes. The lifetime of MB1 repeats is no less than 100 million years. MB1 repeats from the human genome were classified using information similarity. This classification revealed two subfamilies of MB1 repeats in the human genome. Possible processes for the creation of MB1 family repeats common to many mammals are discussed.

5.
The genomes of birds are much smaller than mammalian genomes, and transposable elements (TEs) make up only 10% of the chicken genome, compared with 45% of the human genome. To study the mechanisms that constrain the copy numbers of TEs, and as a consequence the genome size of birds, we analyzed the distributions of LINEs (CR1s) and SINEs (MIRs) on the chicken autosomes and Z chromosome. We show that (1) CR1 repeats are longest on the Z chromosome and their length is negatively correlated with the local GC content; (2) the decay of CR1 elements is highly biased, and the 5'-ends of the insertions are lost much faster than their 3'-ends; (3) the GC distribution of CR1 repeats shows a bimodal pattern with repeats enriched in both AT-rich and GC-rich regions of the genome, but the CR1 families show large differences in their GC distribution; and (4) the few MIRs in the chicken are most abundant in regions with intermediate GC content. Our results indicate that the primary mechanism that removes repeats from the chicken genome is ectopic exchange and that the low abundance of repeats in avian genomes is likely to be the consequence of their high recombination rates.

6.
7.
Development of cervical cancer is directly associated with integration of human papillomavirus (HPV) genomes into host chromosomes and subsequent modulation of HPV oncogene expression, which correlates with multi-layered epigenetic changes at the integrated HPV genomes. However, the process of integration itself and dysregulation of host gene expression at sites of integration in our model of HPV16 integrant clone natural selection has remained enigmatic. We now show, using a state-of-the-art ‘HPV integrated site capture’ (HISC) technique, that integration likely occurs through microhomology-mediated repair (MHMR) mechanisms via either a direct process, resulting in host sequence deletion (in our case, partially homozygously) or via a ‘looping’ mechanism by which flanking host regions become amplified. Furthermore, using our ‘HPV16-specific Region Capture Hi-C’ technique, we have determined that chromatin interactions between the integrated virus genome and host chromosomes, both at short- (<500 kbp) and long-range (>500 kbp), appear to drive local host gene dysregulation through the disruption of host:host interactions within (but not exceeding) host structures known as topologically associating domains (TADs). This mechanism of HPV-induced host gene expression modulation indicates that integration of virus genomes near to or within a ‘cancer-causing gene’ is not essential to influence their expression and that these modifications to genome interactions could have a major role in selection of HPV integrants at the early stage of cervical neoplastic progression.

8.
Adeno-associated virus vector integration junctions.   Total citations: 5 (self-citations: 4; citations by others: 1)
Vectors derived from adeno-associated virus (AAV) have the potential to stably transduce mammalian cells by integrating into host chromosomes. Despite active research on the use of AAV vectors for gene therapy, the structure of integrated vector proviruses has not previously been analyzed at the DNA sequence level. Studies on the integration of wild-type AAV have identified a common site-specific integration locus on human chromosome 19; however, most AAV vectors do not appear to integrate at this locus. To improve our understanding of AAV vector integration, we analyzed the DNA sequences of several integrated vector proviruses. HeLa cells were transduced with an AAV shuttle vector, and integrated proviruses containing flanking human DNA were recovered as bacterial plasmids for further analysis. We found that AAV vectors integrated as single-copy proviruses at random chromosomal locations and that the flanking HeLa DNA at integration sites was not homologous to AAV or the site-specific integration locus of wild-type AAV. Recombination junctions were scattered throughout the vector terminal repeats with no apparent site specificity. None of the integrated vectors were fully intact. Vector proviruses with nearly intact terminal repeats were excised and amplified after infection with wild-type AAV and adenovirus. Our results suggest that AAV vectors integrate by nonhomologous recombination after partial degradation of entering vector genomes. These findings have important implications for the mechanism of AAV vector integration and the use of these vectors in human gene therapy.

9.
L Cui  B A Webb 《Journal of virology》1997,71(11):8504-8513
Polydnaviruses (PDVs) are double-stranded DNA viruses with segmented genomes that replicate only in the oviducts of some species of parasitic wasps and are required for the successful parasitization of lepidopteran insects. PDV DNA segments are integrated in the genomes of their associated wasp hosts, and some are nested; i.e., smaller segments are produced from and largely colinear with larger segments. To determine the internal structure of nested viral segments, the first complete nucleotide sequence of a PDV genome segment and its integration locus was determined. By restriction mapping, Southern blot, and sequence analyses, we demonstrated that the Campoletis sonorensis PDV segment W is integrated into wasp genomic DNA. DNA sequence analysis revealed that proviral segment W terminates in two 1,185-bp direct long terminal repeats (LTRs) in the wasp chromosome, while only one LTR copy is present in the extrachromosomal (viral) W. The results suggest that terminal direct repeats are a general feature of PDV DNA segment integration but that the homology and size of the repeats can vary extensively. Segment W contains 12 imperfect direct repeats of six different types between 89 bp and 1.9 kbp with 65 to 90% homology. The orientation and structure of the repeats suggest that W itself may have arisen through sequence duplication and subsequent divergence. Mapping, hybridization, and sequence analyses of cloned R and M demonstrated that these segments are nested within segment W and that internal imperfect direct repeats of one type are implicated in the homologous intramolecular recombination events that generate segments R and M. Interestingly, segment nesting differentially increases the copy number of genes encoded by segment W, suggesting that the unusual genomic organization of PDVs may be directly linked to the unique functions of this virus in its obligate mutualistic association with parasitic wasps.

10.
We study the length distribution functions for the 16 possible distinct dimeric tandem repeats in DNA sequences of diverse taxonomic partitions of GenBank (known human and mouse genomes, and complete genomes of Caenorhabditis elegans and yeast). For coding DNA, we find that all 16 distribution functions are exponential. For non-coding DNA, the distribution functions for most of the dimeric repeats have surprisingly long tails that fit a power-law function. We hypothesize that: (i) the exponential distributions of dimeric repeats in protein coding sequences indicate strong evolutionary pressure against tandem repeat expansion in coding DNA sequences; and (ii) long tails in the distributions of dimers in non-coding DNA may be a result of various mutational mechanisms. These long, non-exponential tails in the distribution of dimeric repeats in non-coding DNA are hypothesized to be due to the higher tolerance of non-coding DNA to mutations. By comparing genomes of various phylogenetic types of organisms, we find that the shapes of the distributions are not universal, but rather depend on the specific class of species and the type of dimer.
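The length distribution described in the abstract above can be illustrated with a minimal sketch. It assumes a plain uppercase DNA string as input and counts maximal tandem runs of one dimer by the number of repeat units; the function name `dimer_repeat_lengths`, the `min_units` threshold and the toy sequence are hypothetical and are not taken from the paper.

```python
import re
from collections import Counter


def dimer_repeat_lengths(seq, dimer, min_units=2):
    """Count tandem runs of `dimer` in `seq`, keyed by number of repeat units.

    Assumption: a run is a stretch of at least `min_units` consecutive
    copies of the dimer, found left to right without overlap.
    """
    pattern = re.compile(f"(?:{re.escape(dimer)}){{{min_units},}}")
    return Counter(len(m.group(0)) // 2 for m in pattern.finditer(seq.upper()))


# Toy sequence (hypothetical): one (AC)4 run, one (AC)2 run, one (TG)5 run.
seq = "GGACACACACTTACACGGTGTGTGTGTG"
print(dimer_repeat_lengths(seq, "AC"))
print(dimer_repeat_lengths(seq, "TG"))
```

Applied to each of the 16 dimers over coding and non-coding sequence sets, the resulting histograms could then be compared against exponential and power-law fits; the fitting step is not shown here.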

11.
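This study used 31 genome sequences of human adenovirus subgroup B and 39 genome sequences of subgroup D, and systematically analyzed and compared the distribution of simple sequence repeats (SSRs) in these genomes using the Imperfect Microsatellite Extractor and DNAMAN software. The results show that the average relative density of SSRs is very similar in the genomes of subgroups B and D, but that the distribution differs among SSR types. Dinucleotide SSRs are markedly more abundant in subgroup D than in subgroup B; among mononucleotide SSRs, (A)n and (T)n are comparatively abundant in both subgroups, while among dinucleotide SSRs, (CG/GC)n shows a strong preference in both. In multiple sequence alignments within each subgroup, subgroup D showed higher stability. This subgroup-specific distribution of SSRs in subgroups B and D may be related to their evolutionary mechanisms and pathogenicity.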

12.
RepeatAround is a Windows-based software tool designed to find "direct repeats", "inverted repeats", "mirror repeats" and "complementary repeats", from 3 to 64 bp in length, in circular genomes. It processes input files directly extracted from the GenBank database, providing visualisation of the repeat locations in the genomic structure, so that, for instance, in most mtDNAs the user can check whether the repeats are located in coding or non-coding regions (and in the first case in which gene), and how far apart the repeat pair(s) are. Besides the visual tool, it provides other outputs in a spreadsheet containing information on the number and location of the repeats, facilitating graphic analyses. Several genomes can be input simultaneously, for phylogenetic comparison purposes. Other capabilities of the software are the generation of random circular genomes, for statistical comparison of observed repeat distributions with their shuffled counterparts, as well as the search for specific motifs, allowing an easy confirmation of repeats flanking a newly detected rearrangement. As an example of the programme's applications we analysed the Direct Repeats distribution in a large human mtDNA database. Results showed that Direct Repeats, even the larger ones, are evenly distributed among the human mtDNA haplogroups, enabling us to state that, based only on the repetitive motifs, no haplogroup is particularly more or less prone to mtDNA macrodeletions.
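As a rough illustration of searching for direct repeats in a circular genome, the sketch below indexes every k-mer of a sequence, letting k-mers wrap around the origin. It is not RepeatAround's implementation; the function name `direct_repeats_circular` and the toy sequence are hypothetical, and only exact direct repeats of a single fixed length are handled.

```python
from collections import defaultdict


def direct_repeats_circular(seq, k):
    """Return {k-mer: [0-based start positions]} for k-mers seen more than once.

    The sequence is treated as circular, so a k-mer may span the origin.
    """
    seq = seq.upper()
    n = len(seq)
    if k < 1 or k > n:
        return {}
    # Append the first k-1 bases so windows starting near the end wrap around.
    extended = seq + seq[: k - 1]
    positions = defaultdict(list)
    for i in range(n):
        positions[extended[i : i + k]].append(i)
    return {kmer: pos for kmer, pos in positions.items() if len(pos) > 1}


print(direct_repeats_circular("ACGTTTACGA", 3))  # {'ACG': [0, 6]}
```

The distance between the two copies of a repeat pair follows from the returned positions; on a circular molecule of length n it can be taken as the smaller of `b - a` and `n - (b - a)`.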

13.
Prokaryotic genomes seem to be optimized toward compactness and have therefore been thought to lack long redundant DNA sequences. However, we identified a large number of long strict repeats in eight prokaryotic complete genomes and found that their density is negatively correlated with genome size. A detailed analysis of the long repeats present in the genome of Bacillus subtilis revealed a very strict constraint on the spatial distribution of repeats in this genome. We interpret this as the hallmark of selection processes leading to the addition of new genetic information. Such addition is independent of insertion sequences and relies on the nonspecific DNA uptake by the competent cell and its subsequent integration in the chromosome in a circular form through a Campbell-like mechanism. Similar patterns are found in other competent genomes of Gram-negative bacteria and Archaea, suggesting a similar evolutionary mechanism. The correlation of the spatial distribution of repeats and the absence of insertion sequences in a genome may indicate, in the framework of our model, that mechanisms aiming at their avoidance/elimination have been developed.

14.
A new experimental technique for genome-wide detection of integration sites of polymorphic retroelements (REs) is described. The technique allows one to reveal the absence of a retroelement in an individual genome provided that this retroelement is present in at least one of several other genomes under comparison. Since quite a number of genomes are compared simultaneously, the search for polymorphic RE insertions is very efficient. The technique includes two whole-genome selective PCR amplifications of sequences flanking REs: one for a particular genome and another one for a mixture of ten different genomes. A subsequent subtractive hybridization of the obtained amplicons with DNA of a particular genome as driver results in isolation of polymorphic insertions. The technique was successfully applied for identification of 41 new polymorphic human AluYa5/Ya8 insertions. Among them, 18 individual Alu elements first sequenced in this work were not found in the available human genome databases. This result suggests that a significant proportion of polymorphic REs were not identified during genome sequencing and remain to be detected and characterized. The proposed method does not depend on prior knowledge of the evolutionary history of retroelements and can be applied for identification of insertion/deletion polymorphic markers in genomes of different species.

15.
Integrated retroviral genomes are flanked by direct repeats of sequences derived from the termini of the viral RNA genome. These sequences are designated long terminal repeats (LTRs). We have determined and analyzed the nucleotide sequence of the LTRs from several exogenous and endogenous avian retroviruses. These LTRs possess several structural similarities with eukaryotic and prokaryotic transposable elements: 1) inverted complementary repeats at the termini, 2) deletions of sequences adjacent to the LTR, 3) small duplications of host sequences flanking the integrated provirus, and 4) sequence homologies with transposable and other genetic elements. These observations suggest that LTRs function in the integration and perhaps transposition of retrovirus genomes. Evidence exists for the presence of a strong promoter sequence within the LTR. The retroviral LTR also contains a "Hogness box" upstream of the capping site and a poly(A) signal. These features suggest an additional role for the LTR in the regulation of gene expression.

16.
Based on published information, we have identified 991 genes and gene-family clusters for cattle and 764 for pigs that have orthologues in the human genome. The relative linear locations of these genes on human sequence maps were used as "rulers" to annotate bovine and porcine genomes based on a CSAM (contiguous sets of autosomal markers) approach. A CSAM is an uninterrupted set of markers in one genome (primary genome; the human genome in this study) that is syntenic in the other genome (secondary genome; the bovine and porcine genomes in this study). The analysis revealed 81 conserved syntenies and 161 CSAMs between human and bovine autosomes and 50 conserved syntenies and 95 CSAMs between human and porcine autosomes. Using the human sequence map as a reference, these 991 and 764 markers could correlate 72 and 74% of the human genome with the bovine and porcine genomes, respectively. Based on the number of contiguous markers in each CSAM, we classified these CSAMs into five size groups as follows: singletons (one marker only), small (2-4 markers), medium (5-10 markers), large (11-20 markers), and very large (> 20 markers). Several bovine and porcine chromosomes appear to be represented as di-CSAM repeats in a tandem or dispersed way on human chromosomes. The number of potential CSAMs for which no markers are currently available was estimated to be 63 between human and bovine genomes and 18 between human and porcine genomes. These results provide basic guidelines for further gene and QTL mapping of the bovine and porcine genomes, as well as insight into the evolution of mammalian genomes.
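The CSAM definition and the five size groups given above can be sketched as follows. This is a simplified reading of the abstract, not the authors' procedure: it assumes the markers of one human chromosome are already sorted by position and each carries only the name of the secondary-genome chromosome it maps to. The function names and the marker and chromosome labels are hypothetical.

```python
from itertools import groupby


def size_class(n_markers):
    """Size groups as listed in the abstract."""
    if n_markers == 1:
        return "singleton"
    if n_markers <= 4:
        return "small"
    if n_markers <= 10:
        return "medium"
    if n_markers <= 20:
        return "large"
    return "very large"


def find_csams(markers):
    """Group position-ordered (marker, secondary_chromosome) pairs into CSAMs.

    Assumption: a CSAM is a maximal run of consecutive markers that all map
    to the same chromosome in the secondary genome.
    """
    csams = []
    for chrom, run in groupby(markers, key=lambda m: m[1]):
        names = [name for name, _ in run]
        csams.append((chrom, names, size_class(len(names))))
    return csams


# Hypothetical markers ordered along one human chromosome.
markers = [("g1", "BTA1"), ("g2", "BTA1"), ("g3", "BTA5"),
           ("g4", "BTA1"), ("g5", "BTA1"), ("g6", "BTA1")]
for csam in find_csams(markers):
    print(csam)
```

In this toy input the single BTA5 marker interrupts the BTA1 run, so one conserved synteny (BTA1) is split into two CSAMs, which is why the abstract reports more CSAMs than conserved syntenies.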

17.
Exact Tandem Repeats Analyzer 1.0 (E-TRA) combines sequence motif searches with keywords such as ‘organs’, ‘tissues’, ‘cell lines’ and ‘development stages’ for finding simple exact tandem repeats as well as non-simple repeats. E-TRA has several advanced repeat search parameters/options compared to other repeat finder programs, as it not only accepts GenBank, FASTA and expressed sequence tags (EST) sequence files, but also analyzes multiple files with multiple sequences. The minimum and maximum tandem repeat motif lengths that E-TRA finds vary from one to one thousand. Advanced user-defined parameters/options let researchers use different minimum motif repeat search criteria for varying motif lengths simultaneously. One of the most interesting features of genomes is the presence of relatively short tandem repeats (TRs). These repeated DNA sequences are found in both prokaryotes and eukaryotes, distributed almost at random throughout the genome. Some of the tandem repeats play important roles in the regulation of gene expression whereas others do not have any known biological function as yet. Nevertheless, they have proven to be very beneficial in DNA profiling and genetic linkage analysis studies. To demonstrate the use of E-TRA, we used 5,465,605 human EST sequences derived from 18,814,550 GenBank EST sequences. Our results indicated that 12.44% (679,800) of the human EST sequences contained simple and non-simple repeat string patterns varying from one to 126 nucleotides in length. The results also revealed that human organs, tissues, cell lines and different developmental stages differed in number of repeats as well as repeat composition, indicating that the distribution of expressed tandem repeats among tissues or organs is not random, thus differing from the un-transcribed repeats found in genomes.

18.
We present a fast algorithm to search for repeating fragments within protein sequences. The technique is based on an extension of the Smith-Waterman algorithm that allows the calculation of sub-optimal alignments of a sequence against itself. We are able to estimate the statistical significance of all sub-optimal alignment scores. We also rapidly determine the length of the repeating fragment and the number of times it is found in a sequence. The technique is applied to sequences in the Swissprot database, and to 16 complete genomes. We find that eukaryotic proteins contain more internal repeats than those of prokaryotic and archaeal organisms. The finding that 18% of yeast sequences and 28% of the known human sequences contain detectable repeats emphasizes the importance of internal duplication in protein evolution.
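The core idea of aligning a sequence against itself can be sketched with a basic Smith-Waterman recursion restricted to the cells above the main diagonal, so that the trivial alignment of the sequence with itself is excluded. This is only an illustration under simplifying assumptions: it uses a flat match/mismatch/gap scoring scheme instead of a substitution matrix, reports only the single best local alignment, and does not implement the sub-optimal alignment extension or the significance estimates described in the abstract. The function name and scoring values are hypothetical.

```python
def best_internal_repeat(seq, match=2, mismatch=-1, gap=-2):
    """Best local self-alignment score above the main diagonal.

    Returns (score, i, j): the score and the 1-based end positions of the
    two aligned copies, with i < j.
    """
    n = len(seq)
    prev = [0] * (n + 1)
    best = (0, 0, 0)
    for i in range(1, n + 1):
        cur = [0] * (n + 1)
        # Only cells with j > i are filled; the diagonal stays at zero.
        for j in range(i + 1, n + 1):
            s = match if seq[i - 1] == seq[j - 1] else mismatch
            cur[j] = max(0, prev[j - 1] + s, prev[j] + gap, cur[j - 1] + gap)
            if cur[j] > best[0]:
                best = (cur[j], i, j)
        prev = cur
    return best


# Toy protein-like sequence (hypothetical) with the fragment ACDEFGH duplicated.
print(best_internal_repeat("ACDEFGHKLACDEFGH"))  # (14, 7, 16)
```

The sketch keeps only two rows of the dynamic-programming matrix, so it runs in quadratic time and linear memory but cannot trace back the aligned fragments; recovering them would require storing the full matrix or traceback pointers.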

19.
Koressaar T  Remm M 《DNA research》2012,19(3):219-230
Prokaryotes are in general believed to possess small, compactly organized genomes, with repetitive sequences forming only a small part of them. Nonetheless, many prokaryotic genomes in fact contain species-specific repeats (>85 bp long genomic sequences with less than 60% identity to other species), as we have previously demonstrated. However, it is not known at present how frequent such species-specific repeats are and what their functional roles in bacterial genomes may be. Therefore, we have conducted a comprehensive survey of prokaryotic species-specific repeats and characterized them to examine whether there are functional classes among different repeats and how they are related to each other. Of the 613 distinct prokaryotic species analyzed, 97% were found to contain at least one species-specific repeat. The species-specific repeats thus identified appear to be functionally variable in different genomes: in some genomes, they are mostly associated with duplicated protein-coding genes, whereas in others they are associated with rRNA and tRNA genes. Contrary to what may be expected, only one-fourth of the species-specific repeats were found to be associated with mobile genetic elements.

20.
Telomeric repeats, represented by the hexamer (TTAGGG)n at chromosome termini, are required for correct chromosome function and stability. At the same time, interstitial telomeric sequences (ITSs) located far from the chromosome ends are known for several mammalian genomes, including the human genome. It is assumed that these repeats mark the points of fusion or other chromosomal rearrangements in ancestral genomes. Exact localization of all interstitial telomeric sequences in the genome could greatly improve our understanding of the mechanism of karyotype evolution and species origin. We have developed software to search for interstitial telomeric sequences in complete sequences of mammalian genomes. We have demonstrated the evolutionary significance of the repeats using the example of human chromosome 2. The results and supplementary materials are available at the site of the Institute of Cytology and Genetics: http://www.bionet.nsc.ru/labs/theorylabmain/orlov/telomere/.
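A search for interstitial (TTAGGG)n tracts can be sketched with a regular expression that scans for the hexamer on the forward strand and for its reverse complement, CCCTAA, which marks the same repeat on the opposite strand. This is not the software described in the abstract; the function name `find_its`, the `min_units` threshold and the `end_margin` used to skip true terminal telomeres are hypothetical, and only perfect repeat units are detected.

```python
import re


def find_its(seq, min_units=3, end_margin=0):
    """Find perfect (TTAGGG)n / (CCCTAA)n tracts of at least `min_units` units.

    Tracts lying within `end_margin` bases of either sequence end are skipped,
    on the assumption that they belong to the terminal telomere.
    Returns a list of (start, end, strand, units), 0-based and end-exclusive.
    """
    seq = seq.upper()
    pattern = re.compile(
        rf"(?:TTAGGG){{{min_units},}}|(?:CCCTAA){{{min_units},}}"
    )
    hits = []
    for m in pattern.finditer(seq):
        if m.start() >= end_margin and m.end() <= len(seq) - end_margin:
            strand = "+" if m.group(0).startswith("TTAGGG") else "-"
            hits.append((m.start(), m.end(), strand, (m.end() - m.start()) // 6))
    return hits


# Toy sequence (hypothetical): a (TTAGGG)4 tract followed by a (CCCTAA)3 tract.
toy = "ACGT" * 3 + "TTAGGG" * 4 + "ACGTAC" + "CCCTAA" * 3 + "GATTACA"
print(find_its(toy))  # [(12, 36, '+', 4), (42, 60, '-', 3)]
```

A head-to-head arrangement, in which a forward tract is followed closely by a reverse-strand tract, is the pattern one would look for at a candidate ancestral fusion point; tolerating degenerate repeat units would require an approximate matcher rather than this exact pattern.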
