Found 20 similar articles (search time: 15 ms)
1.
Repetitive sequences in the crocodilian mitochondrial control region: poly-A sequences and heteroplasmic tandem repeats (Cited by: 3; self-citations: 0; citations by others: 3)
Heteroplasmic tandem repeats in the mitochondrial control region have been documented in a wide variety of vertebrate species. We have examined the control region from 11 species in the family Crocodylidae and identified two different types of heteroplasmic repetitive sequences in the conserved sequence block (CSB) domain: an extensive poly-A tract that appears to be involved in the formation of secondary structure, and a series of tandem repeats located downstream, ranging from approximately 50 to approximately 80 bp in length. We describe this portion of the crocodylian control region in detail and focus on members of the family Crocodylidae. We then address the origins of the tandemly repeated sequences in this family and suggest hypotheses to explain possible mechanisms of expansion/contraction of the sequences. We have also examined control region sequences from Alligator and Caiman and offer hypotheses for the origin of tandem repeats found in those taxa. Finally, we present a brief analysis of intraindividual and interindividual haplotype variation by examining representatives of Morelet's crocodile (Crocodylus moreletii).
2.
3.
4.
Identification and nucleotide sequences of two similar tandem direct repeats in Epstein-Barr virus DNA (Cited by: 5; self-citations: 13; citations by others: 5)
Epstein-Barr virus DNA is known to have partially homologous segments, designated DL and DR, near the left and right ends of the long unique region (Raab-Traub et al., Cell 22:257-267, 1980). DL and DR are each partially composed of tandem direct repeat sequences. DL contains 11 to 14 repeats of a 124-base-pair sequence designated IR2. DR contains approximately 30 direct repeats of a 103-base-pair sequence designated IR4. The DL and DR sequences have colinear partial homology for approximately 2.4 and 1.5 kilobase pairs to the right of IR2 and IR4, respectively. IR2 and IR4 are similar sequences and evolved in part from a common ancestor. Both sequences are 84% guanine and cytosine and have limited homology to Epstein-Barr virus IR1 and to the herpes simplex virus type 1 inverted terminal repeat "a" sequence. IR2 encodes part of an abundant 2.5-kilobase persistent early EBV RNA expressed in productively infected cells, but does not encode part of the 3-kilobase Epstein-Barr virus RNA which is transcribed from the adjacent IR1-U2 region of the Epstein-Barr virus genome in latently infected cells.
5.
The Tibetan macaque, which is endemic to China, is currently listed as a Near Endangered primate species by the International Union for Conservation of Nature (IUCN) (2017). Short tandem repeats (STRs) are repetitive genomic elements whose repeat units are 1–6 bp long. They are found in many organisms and are widely applied in population genetic studies. To clarify the distribution characteristics of genome-wide STRs and understand their variation among Tibetan macaques, we conducted a genome-wide survey of STRs with next-generation sequencing of five macaque samples. A total of 1 077 790 perfect STRs were mined from our assembly, which has an N50 of 4 966 bp. Mono-nucleotide repeats were the most abundant, followed by tetra- and di-nucleotide repeats. Analysis of GC content and repeats showed results consistent with other macaques. Furthermore, using STR analysis software (lobSTR), we found that the proportion of base pair deletions in the STRs was greater than that of insertions in the five Tibetan macaque individuals (P < 0.05, t-test). We also found a greater number of homozygous STRs than heterozygous STRs (P < 0.05, t-test), with the Emei and Jianyang Tibetan macaques showing more heterozygous loci than Huangshan Tibetan macaques. The proportion of insertions and mean variation of alleles in the Emei and Jianyang individuals were slightly higher than those in the Huangshan individuals, thus revealing differences in STR allele size between the two populations. The polymorphic STR loci identified based on the reference genome showed good amplification efficiency and could be used to study population genetics in Tibetan macaques. The neighbor-joining tree classified the five macaques into two different branches according to their geographical origin, indicating high genetic differentiation between the Huangshan and Sichuan populations.
We elucidated the distribution characteristics of STRs in the Tibetan macaque genome and provided an effective method for screening polymorphic STRs. Our results also lay a foundation for future genetic variation studies of macaques.
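As a rough illustration of what mining "perfect" STRs involves, the Python sketch below scans a sequence for uninterrupted tandem copies of 1–6 bp motifs using regular expressions. It is not the pipeline used in the study above; the function name `find_perfect_strs` and the `min_copies` threshold are assumptions chosen only for illustration.

```python
import re

def find_perfect_strs(seq, min_unit=1, max_unit=6, min_copies=3):
    """Return (start, unit, copies) for perfect tandem repeats in seq.

    Hypothetical helper: the thresholds are illustrative, not taken
    from any published tool.
    """
    seq = seq.upper()
    hits = []
    for unit_len in range(min_unit, max_unit + 1):
        # A unit of unit_len bases followed by at least (min_copies - 1) exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_copies - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            # Skip units that are themselves made of a shorter motif
            # (e.g. "ACAC" is really the dinucleotide "AC").
            if any(unit == unit[:k] * (unit_len // k)
                   for k in range(1, unit_len) if unit_len % k == 0):
                continue
            hits.append((m.start(), unit, len(m.group(0)) // unit_len))
    return hits

if __name__ == "__main__":
    print(find_perfect_strs("GGACACACACTT"))
```

Counting the hits by unit length would give a simple per-class tally (mono-, di-, tri-nucleotide and so on) of the kind summarized in the abstract.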
6.
Genomic DNA contains a wide variety of repetitive sequences. In Escherichia coli, there have been several classes of repetitive sequences reported, some of which cluster as tandem repeats. We propose a novel method for analyzing symbolic sequences by two-dimensional pattern formation with color-coding. We applied this method for searching tandem repeats in the E. coli genome and found approximately 50 repeats with periods longer than 30 bases. The longest repeat has a period of 1267 bases.
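The abstract does not give the details of the two-dimensional method, so the following Python sketch only illustrates one plausible intuition behind it: if a sequence is wrapped into rows of a fixed width, a tandem repeat whose period equals that width lines up as identical columns. The functions `column_match_fraction` and `best_period` are hypothetical names, and this is an assumption-laden toy, not the authors' algorithm.

```python
def column_match_fraction(seq, width):
    """Fraction of positions whose base equals the base directly below it
    when seq is wrapped into rows of `width` characters."""
    pairs = len(seq) - width
    if pairs <= 0:
        return 0.0
    matches = sum(1 for i in range(pairs) if seq[i] == seq[i + width])
    return matches / pairs

def best_period(seq, max_width):
    """Smallest row width (1..max_width) with the highest column match fraction."""
    return max(range(1, max_width + 1),
               key=lambda w: column_match_fraction(seq, w))

if __name__ == "__main__":
    repeat = "ACGTT" * 6  # a perfect tandem repeat with period 5
    print(best_period(repeat, 12))
```

In a color-coded rendering, the row width returned here would be the one at which the repeat shows up as vertical stripes.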
7.
Finding approximate tandem repeats in genomic sequences. (Cited by: 1; self-citations: 0; citations by others: 1)
Ydo Wexler, Zohar Yakhini, Yechezkel Kashi, Dan Geiger. Journal of Computational Biology, 2005, 12(7): 928-942
An efficient algorithm is presented for detecting approximate tandem repeats in genomic sequences. The algorithm is based on a flexible statistical model which allows a wide range of definitions of approximate tandem repeats. The ideas and methods underlying the algorithm are described and its effectiveness on genomic data is demonstrated.
8.
Repetitive extragenic palindromic sequences: a major component of the bacterial genome (Cited by: 50; self-citations: 0; citations by others: 50)
We describe a remarkably conserved nucleotide sequence, the many copies of which may occupy up to 1% of the genomes of E. coli and S. typhimurium. This sequence, the REP (repetitive extragenic palindromic) sequence, is about 35 nucleotides long, includes an inverted repeat, and can occur singly or in multiple adjacent copies. A possible role for the REP sequences in regulation of gene expression has been thoroughly investigated. While the REP sequences do not appear to modulate differential gene expression within an operon, they can affect the expression of both upstream and downstream genes to a small extent, probably by affecting the rate of mRNA degradation. Possible roles for the REP sequence in mRNA degradation, chromosome structure, and recombination are discussed.
9.
We study the length distribution functions for the 16 possible distinct dimeric tandem repeats in DNA sequences of diverse taxonomic partitions of GenBank (known human and mouse genomes, and complete genomes of Caenorhabditis elegans and yeast). For coding DNA, we find that all 16 distribution functions are exponential. For non-coding DNA, the distribution functions for most of the dimeric repeats have surprisingly long tails, that fit a power-law function. We hypothesize that: (i) the exponential distributions of dimeric repeats in protein coding sequences indicate strong evolutionary pressure against tandem repeat expansion in coding DNA sequences; and (ii) long tails in the distributions of dimers in non-coding DNA may be a result of various mutational mechanisms. These long, non-exponential tails in the distribution of dimeric repeats in non-coding DNA are hypothesized to be due to the higher tolerance of non-coding DNA to mutations. By comparing genomes of various phylogenetic types of organisms, we find that the shapes of the distributions are not universal, but rather depend on the specific class of species and the type of a dimer.
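To make the notion of a length distribution for a dimeric repeat concrete, the following Python sketch tallies how many uninterrupted runs of a given dimer occur at each copy number. It is a minimal illustration under assumptions of my own (the function name `dimer_repeat_length_counts`, and counting single occurrences as runs of length 1), not the procedure used in the study.

```python
import re
from collections import Counter

def dimer_repeat_length_counts(seq, dimer):
    """Map copy number -> number of maximal tandem runs of `dimer` in seq.

    Hypothetical helper; a single isolated dimer counts as a run of 1 copy.
    """
    if len(dimer) != 2:
        raise ValueError("dimer must be exactly two bases long")
    pattern = re.compile("(?:%s)+" % re.escape(dimer.upper()))
    return Counter(len(m.group(0)) // 2 for m in pattern.finditer(seq.upper()))

if __name__ == "__main__":
    counts = dimer_repeat_length_counts("ACACACGGACTTACAC", "AC")
    for copies in sorted(counts):
        print(copies, counts[copies])
```

Plotting such counts against copy number on semi-logarithmic axes (where an exponential is a straight line) and on log-log axes (where a power law is a straight line) is one common way to tell the two shapes apart.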
10.
DNA from ground squirrels of the genus Citellus (Rodentia, Sciuridae) was analysed by centrifugation in the presence of CsCl followed by digestion with restriction endonucleases. Digestion of DNA from two species, C. undulatus and C. fulvus, with 10 of the 16 restriction endonucleases used produced electrophoretically discrete fragments whose lengths are multiples of 330 b.p., which points to a tandem organization of repetitive sequences similar to the satellite DNA of many mammalian species. However, centrifugation failed to reveal a satellite band in these species; hence the tandem repeats in the ground squirrels belong to the class of cryptic satellites and do not differ in base composition from the rest of the DNA. The main fraction of the genome was revealed in the form of discrete fragments by cleavage with HindIII and AluI. Both of these restriction endonucleases were used for comparative analysis of the DNA of 12 Citellus species. The DNA of all species can be digested by HindIII to yield a series of fragments whose lengths are multiples of 330-30 b.p. and whose total content varies between species within 4-22%. The fraction of the tandem repeats correlates neither with the systematic position of the species nor with the amount of heterochromatin in the chromosomes. AluI cuts the DNA of 11 species, yielding 110 and 220 b.p. fragments, compared to only 60 and 280 b.p. in the DNA of C. dauricus. HindIII digestion also reveals tandem repeats in marmot (genus Marmota), which is phylogenetically close to Citellus, but they have a different periodicity: 180 b.p. We propose that the ground squirrel repeats are 2-3 million years old and significantly younger than the marmot repeats.
11.
The placozoan Trichoplax adhaerens has a compact genome with many primitive eumetazoan characteristics. In order to gain a better understanding of its genome architecture, we conducted a detailed analysis of repeat content in this genome. The transposable element (TE) content is lower than that of other metazoans, and the few TEs present in the genome appear to be inactive. A new phylogenetic clade of the gypsy-like LTR retrotransposons was identified, which includes the majority of gypsy-like elements in Trichoplax. A particular microsatellite motif (ACAGT) exhibits unexpectedly high abundance, and also has strong association with its nearby genes.
12.
A unique group of large icosahedral viruses that infect a unicellular green alga (Chlorella sp. NC64A) were isolated from freshwater sources in Japan. These viruses contain a linear double-stranded DNA (dsDNA) genome with hairpin ends. A physical map was constructed for the genomic DNA of CVK1 (Chlorella virus isolated in Kyoto, no. 1) by pulsed-field gel electrophoresis of restriction fragments. The nucleotide sequences around both termini of the CVK1 DNA revealed the presence of inverted terminal repeats (ITR) of approximately 1.0 kb. Adjacent to the ITR, unique sequence elements of 10 to 20 bp were directly repeated 20 to 30 times in tandem array. Several copies of these repeat elements were deleted in virus mutants that were occasionally generated from Chlorella cells that were in a putative CVK1 carrier state. These repeats might represent a hot spot of rearrangement in the CVK1 genome.
13.
Short tandem repeats, specifically microsatellites, are widely used genetic markers, are associated with human genetic diseases, and play an important role in various regulatory mechanisms and evolution. Despite their importance, much is yet unknown about their mutational dynamics. The increasing availability of genome data has led to several in silico studies of microsatellite evolution which have produced a vast range of algorithms and software for tandem repeat detection. Documentation of these tools is often sparse, or provided in a format that is impenetrable to most biologists without informatics background. This article introduces the major concepts behind repeat detecting software essential for informed tool selection. We reflect on issues such as parameter settings and program bias, as well as redundancy filtering and efficiency using examples from the currently available range of programs, to provide an integrated comparison and practical guide to microsatellite detecting programs.
14.
Denis Mariat, Béatrice De Gouyon, Cécile Julier, Mark Lathrop, Gilles Vergnaud. Mammalian Genome, 1993, 4(3): 135-140
Polymers of arbitrary oligonucleotides can be used to detect polymorphic loci in a wide range of vertebrate genomes. Using 60 such probes, we previously reported the selection of the most efficient STR probes for polymorphism detection in the set of genomes investigated. We now report the use of this selection for the mouse genome and its contribution to genetic mapping. Twenty-three synthetic tandem repeat (STR) sequences were probed on a recombinant inbred panel C57BL/6 x DBA/2. The loci detected are distributed in 70 linkage groups; 42 of these groups, corresponding to about 100 different polymorphic loci, include reference markers. These linkage groups appear to be evenly distributed within all the 20 mouse chromosomes, with no apparent bias in distribution towards telomeres or centromeres.
15.
Background
Polymorphic tandem repeat typing is a new generic technology which has been proved to be very efficient for bacterial pathogens such as B. anthracis, M. tuberculosis, P. aeruginosa, L. pneumophila, Y. pestis. The previously developed tandem repeats database takes advantage of the release of genome sequence data for a growing number of bacteria to facilitate the identification of tandem repeats. The development of an assay then requires the evaluation of tandem repeat polymorphism on well-selected sets of isolates. In the case of major human pathogens, such as S. aureus, more than one strain is being sequenced, so that tandem repeats most likely to be polymorphic can now be selected in silico based on genome sequence comparison.
Results
In addition to the previously described general Tandem Repeats Database, we have developed a tool to automatically identify tandem repeats of a different length in the genome sequence of two (or more) closely related bacterial strains. Genome comparisons are pre-computed. The results of the comparisons are parsed in a database, which can be conveniently queried over the internet according to criteria of practical value, including repeat unit length, predicted size difference, etc. Comparisons are available for 16 bacterial species, and the orthopox viruses, including the variola virus and three of its close neighbors.
Conclusions
We are presenting an internet-based resource to help develop and perform tandem repeats based bacterial strain typing. The tools accessible at http://minisatellites.u-psud.fr now comprise four parts. The Tandem Repeats Database enables the identification of tandem repeats across entire genomes. The Strain Comparison Page identifies tandem repeats differing between different genome sequences from the same species. The "Blast in the Tandem Repeats Database" facilitates the search for a known tandem repeat and the prediction of amplification product sizes. The "Bacterial Genotyping Page" is a service for strain identification at the subspecies level.
16.
A unique group of large icosahedral viruses that infect a unicellular green alga (Chlorella sp. NC64A) were isolated from freshwater sources in Japan. These viruses contain a linear double-stranded DNA (dsDNA) genome with hairpin ends. A physical map was constructed for the genomic DNA of CVK1 (Chlorella virus isolated in Kyoto, no. 1) by pulsed-field gel electrophoresis of restriction fragments. The nucleotide sequences around both termini of the CVK1 DNA revealed the presence of inverted terminal repeats (ITR) of approximately 1.0 kb. Adjacent to the ITR, unique sequence elements of 10 to 20 bp were directly repeated 20 to 30 times in tandem array. Several copies of these repeat elements were deleted in virus mutants that were occasionally generated from Chlorella cells that were in a putative CVK1 carrier state. These repeats might represent a hot spot of rearrangement in the CVK1 genome.
17.
MOTIVATION: Tandem repeats are associated with disease genes, play an important role in evolution and are important in genomic organization and function. Although much research has been done on short perfect patterns of repeats, there has been less focus on imperfect repeats. Thus, there is an acute need for a tandem repeats database that provides reliable and up to date information on both perfect and imperfect tandem repeats in the human genome and relates these to disease genes. RESULTS: This paper presents a web-accessible relational tandem repeats database that relates tandem repeats to gene locations and disease genes of the human genome. In contrast to other available databases, this database identifies both perfect and imperfect repeats of 1-2000 bp unit lengths. The utility of this database has been illustrated by analysing these repeats for their distribution and frequencies across chromosomes and genomic locations and between protein-coding and non-coding regions. The applicability of this database to identify diseases associated with previously uncharacterized tandem repeats is demonstrated.
18.
Tandem repeats occur frequently in biological sequences. They are important for studying genome evolution and human disease. A number of methods have been designed to detect a single tandem repeat in a sliding window. In this article, we focus on the case that an unknown number of tandem repeat segments of the same pattern are dispersively distributed in a sequence. We construct a probabilistic generative model for the tandem repeats, where the sequence pattern is represented by a motif matrix. A Bayesian approach is adopted to compute this model. Markov chain Monte Carlo (MCMC) algorithms are used to explore the posterior distribution as an effort to infer both the motif matrix of tandem repeats and the location of repeat segments. Reversible jump Markov chain Monte Carlo (RJMCMC) algorithms are used to address the transdimensional model selection problem raised by the variable number of repeat segments. Experiments on both synthetic data and real data show that this new approach is powerful in detecting dispersed short tandem repeats. As far as we know, it is the first work to adopt RJMCMC algorithms in the detection of tandem repeats.
19.
Differential distribution of simple sequence repeats in eukaryotic genome sequences. (Cited by: 34; self-citations: 0; citations by others: 34)
Complete chromosome/genome sequences available from humans, Drosophila melanogaster, Caenorhabditis elegans, Arabidopsis thaliana, and Saccharomyces cerevisiae were analyzed for the occurrence of mono-, di-, tri-, and tetranucleotide repeats. In all of the genomes studied, dinucleotide repeat stretches tended to be longer than other repeats. Additionally, tetranucleotide repeats in humans and trinucleotide repeats in Drosophila also seemed to be longer. Although the trends for different repeats are similar between different chromosomes within a genome, the density of repeats may vary between different chromosomes of the same species. The abundance or rarity of various di- and trinucleotide repeats in different genomes cannot be explained by nucleotide composition of a sequence or potential of repeated motifs to form alternative DNA structures. This suggests that in addition to nucleotide composition of repeat motifs, characteristic DNA replication/repair/recombination machinery might play an important role in the genesis of repeats. Moreover, analysis of complete genome coding DNA sequences of Drosophila, C. elegans, and yeast indicated that expansions of codon repeats corresponding to small hydrophilic amino acids are tolerated more, while strong selection pressures probably eliminate codon repeats encoding hydrophobic and basic amino acids. The locations and sequences of all of the repeat loci detected in genome sequences and coding DNA sequences are available at http://www.ncl-india.org/ssr and could be useful for further studies.
20.
Sequences flanking the repeat arrays of human minisatellites: association with tandem and dispersed repeat elements. (Cited by: 15; self-citations: 10; citations by others: 15)
We present DNA sequences flanking cloned hypervariable human minisatellites. In addition to providing confirmatory evidence that minisatellites cluster with other tandem repeats, these flanking sequences contain a high frequency of interspersed repetitive elements. These elements include a retroviral LTR-like sequence, from which one of the minisatellites appears to have expanded, and a recently described short interspersed repeat. We present our own findings concerning this element, in particular that those examples studied do not show significant evolutionary conservation, despite suggestions that the element may have a cis-acting function.