Similar articles
20 similar articles found (search time: 62 ms)
1.
We explored the possibility of whole-genome duplication (WGD) in prokaryotic species by statistically analyzing the configurations of the central angles between homologous tandem repeats (TRs) on circular chromosomes. First, we detected TRs on the chromosomes and identified equivalent tandem repeat pairs (ETRPs); here, an ETRP is defined as a pair of tandem repeats with similar sequences. We then carried out statistical analyses of the central angle distributions of the de...
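The abstract above does not say how the central angle is computed. A minimal sketch, assuming the angle is the smaller arc between two repeat positions expressed as a fraction of a full circle; the function name `central_angle` and its parameters are hypothetical and not taken from the paper:

```python
def central_angle(pos_a: int, pos_b: int, genome_length: int) -> float:
    """Return the smaller central angle (degrees) between two loci
    on a circular chromosome of the given length (assumed definition)."""
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    # Separation along the circle, wrapped into [0, genome_length)
    separation = abs(pos_a - pos_b) % genome_length
    # Take the shorter of the two arcs joining the loci
    arc = min(separation, genome_length - separation)
    return 360.0 * arc / genome_length
```

Under this assumed definition, two loci on opposite sides of the chromosome give 180 degrees, the configuration one might look for if repeats were duplicated together with the whole genome.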

2.
Genome variation studies in Plasmodium falciparum have focused on SNPs and, more recently, large-scale copy number polymorphisms and ectopic rearrangements. Here, we examine another source of variation: variable number tandem repeats (VNTRs). Interspersed low complexity features, including the well-studied P. falciparum microsatellite sequences, are commonly classified as VNTRs; however, this study is focused on longer coding VNTR polymorphisms, a small class of copy number variations. Selection against frameshift mutation is a main constraint on tandem repeats (TRs) in coding regions, while limited propagation of TRs longer than 975 nt total length is a minor restriction in coding regions. Comparative analysis of three P. falciparum genomes reveals that more than 9% of all P. falciparum ORFs harbor VNTRs, much more than has been reported for any other species. Moreover, genotyping of VNTR loci in a drug-selected line, progeny of a genetic cross, and 334 field isolates demonstrates broad variability in these sequences. Functional enrichment analysis of ORFs harboring VNTRs identifies stress and DNA damage responses along with chromatin modification activities, suggesting an influence on genome mutability and functional variation. Analysis of the repeat units and their flanking regions in both P. falciparum and Plasmodium reichenowi sequences implicates a replication slippage mechanism in the generation of TRs from an initially unrepeated sequence. VNTRs can contribute to rapid adaptation by localized sequence duplication. They also can confound SNP-typing microarrays or mapping short-sequence reads and therefore must be accounted for in such analyses.

3.
Expansion or shrinkage of existing tandem repeats (TRs) associated with various biological processes has been actively studied in both prokaryotic and eukaryotic genomes, while their origin and biological implications remain mostly unknown. Here we describe various duplications (de novo TRs) that occurred in the coding region of a β-lactamase gene, where a conserved structure called the omega loop is encoded. These duplications, which occurred under selection using ceftazidime, conferred substrate spectrum extension to include the antibiotic. Under selective pressure with one of the original substrates (amoxicillin), a high level of reversion occurred in the mutant β-lactamase genes, completing a cycle back to the original substrate spectrum. The de novo TRs coupled with reversion make a genetic toggling mechanism enabling reversible switching between the two phases of the substrate spectrum of β-lactamases. This toggle exemplifies the effective adaptation of de novo TRs for enhanced bacterial survival. We found pairs of direct repeats that mediated the DNA duplication (TR formation). In addition, we found different duos of sequences that mediated the DNA duplication. These novel elements, which we named SCSs (same-strand complementary sequences), were also found associated with β-lactamase TR mutations from clinical isolates. Both direct repeats and SCSs had a high correlation with TRs in diverse bacterial genomes throughout the major phylogenetic lineages, suggesting that they comprise a fundamental mechanism shaping bacterial evolution.

4.
Exact Tandem Repeats Analyzer 1.0 (E-TRA) combines sequence motif searches with keywords such as ‘organs’, ‘tissues’, ‘cell lines’ and ‘development stages’ for finding simple exact tandem repeats as well as non-simple repeats. E-TRA has several advanced repeat search parameters/options compared to other repeat finder programs, as it not only accepts GenBank, FASTA and expressed sequence tag (EST) sequence files, but also analyzes multiple files with multiple sequences. The minimum and maximum tandem repeat motif lengths that E-TRA finds vary from one to one thousand. Advanced user-defined parameters/options let researchers use different minimum motif repeat search criteria for varying motif lengths simultaneously. One of the most interesting features of genomes is the presence of relatively short tandem repeats (TRs). These repeated DNA sequences are found in both prokaryotes and eukaryotes, distributed almost at random throughout the genome. Some of the tandem repeats play important roles in the regulation of gene expression whereas others do not have any known biological function as yet. Nevertheless, they have proven to be very beneficial in DNA profiling and genetic linkage analysis studies. To demonstrate the use of E-TRA, we used 5,465,605 human EST sequences derived from 18,814,550 GenBank EST sequences. Our results indicated that 12.44% (679,800) of the human EST sequences contained simple and non-simple repeat string patterns varying from one to 126 nucleotides in length. The results also revealed that human organs, tissues, cell lines and different developmental stages differed in number of repeats as well as repeat composition, indicating that the distribution of expressed tandem repeats among tissues or organs is not random, thus differing from the un-transcribed repeats found in genomes.
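The idea of searching for exact tandem repeats with a different minimum copy number per motif length can be sketched with a regular-expression backreference. This is only an illustration, not E-TRA's implementation; the function name `find_exact_tandem_repeats` and the `min_copies_by_len` mapping are hypothetical, and the sketch assumes an uppercase-able DNA string over A, C, G and T:

```python
import re

def find_exact_tandem_repeats(seq, min_copies_by_len):
    """Find exact tandem repeats in a DNA string.

    min_copies_by_len maps motif length -> minimum number of copies
    (hypothetical parameter; each minimum must be at least 2).
    Returns a list of (start, motif, copies) tuples.
    """
    seq = seq.upper()
    hits = []
    for motif_len, min_copies in sorted(min_copies_by_len.items()):
        if min_copies < 2:
            raise ValueError("a tandem repeat needs at least 2 copies")
        # A motif of motif_len bases followed by at least min_copies - 1 exact copies
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_copies - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are themselves repeats of a shorter unit, e.g. "AA"
            if any(motif == motif[:k] * (motif_len // k)
                   for k in range(1, motif_len) if motif_len % k == 0):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return hits
```

For example, `find_exact_tandem_repeats("GGCACACACATT", {2: 3})` reports the CA dinucleotide run starting at index 2 with four copies. Matches are non-overlapping per motif length, which is a simplification a real tool may not share.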

5.
MOTIVATION: One of the most interesting features of genomes (both coding and non-coding regions) is the presence of relatively short tandemly repeated DNA sequences known as tandem repeats (TRs). We developed a new PC-based stand-alone software analysis program, combining sequence motif searches with keywords such as organs, tissues, cell lines or development stages for finding exact, inexact and compound TRs. Tandem Repeats Analyzer 1.5 (TRA) has several advanced repeat search parameters/options over other repeat finder programs, as it not only accepts GenBank, FASTA and expressed sequence tag (EST) sequence files but also analyzes multiple files with multiple sequences. Advanced user-defined parameters/options let researchers use different search criteria for varying motif lengths simultaneously. The outputs show statistical results to be evaluated by the user. The discovery of TRs in ESTs could be useful for both gene mapping and association studies and for discovering TRs located in coding regions of important genes that are expressed under various conditions of environment, stress, organ, tissue and development stage. RESULTS: In this paper, we demonstrated applications of TRA using 175 899 EST sequences for three Arabidopsis spp. downloaded from GenBank. The EST-SSRs/ESTs ratios were found to be 43.1%, 15.3% and 2.34% in A. lyrata, A. thaliana and A. halleri, respectively. Analysis revealed that organs, tissues and development stages possessed different amounts of repeats and repeat compositions. This indicated that the distribution of TRs among the tissues or organs may not be random, differing from the untranscribed repeats found in genomes. AVAILABILITY: The program can be obtained free by anonymous FTP from ftp.akdeniz.edu.tr/Araclar/TRA.

6.
Microsatellites or Simple Sequence Repeats (SSRs) are tandem iterations of one to six base pairs, non-randomly distributed throughout prokaryotic and eukaryotic genomes. Limited knowledge is available about distribution of microsatellites in single stranded DNA (ssDNA) viruses, particularly vertebrate infecting viruses. We studied microsatellite distribution in 118 ssDNA virus genomes belonging to three families of vertebrate infecting viruses namely Circoviridae, Parvoviridae, and Anelloviridae, and found that microsatellites constitute an important component of these virus genomes. Mononucleotide repeats were predominant followed by dinucleotide and trinucleotide repeats. A strong positive relationship existed between number of mononucleotide repeats and genome size among all the three virus families. A similar relationship existed for the occurrence of DTTPH (di-, tri-, tetra-, penta- and hexa-nucleotide) repeats in the families Anelloviridae and Parvoviridae only. Relative abundance and relative density of mononucleotide repeats showed a strong positive relationship with genome size in Circoviridae and Parvoviridae. However, in the case of DTTPH repeats, these features showed a strong relationship with genome size in Circoviridae only. On the other hand, relative microsatellite abundance and relative density of mononucleotide repeats were negatively correlated with GC content (%) in Parvoviridae genomes. On the basis of available annotations, our analysis revealed maximum occurrence of mononucleotide as well as DTTPH repeats in the coding regions of these virus genomes. Interestingly, after normalizing the length of the coding and non-coding regions of each virus genome, we found relative density of microsatellites much higher in the non-coding regions. 
We expect that the present study will help in the better characterization of the stability, genome organization and evolution of these virus classes and may provide useful leads to decipher the etiopathogenesis of these viruses.
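The abstract above reports relative abundance and relative density without defining them. A minimal sketch, assuming the commonly used definitions (repeats per kb of sequence, and repeat-covered bp per kb of sequence); both the definitions and the function name `relative_abundance_and_density` are assumptions, not taken from the study:

```python
def relative_abundance_and_density(repeats, genome_length_bp):
    """Compute (relative abundance, relative density) for a genome.

    repeats: list of (motif, copies) tuples, one per microsatellite.
    Assumed definitions:
      relative abundance = number of microsatellites per kb of sequence
      relative density   = bp covered by microsatellites per kb of sequence
    """
    if genome_length_bp <= 0:
        raise ValueError("genome_length_bp must be positive")
    kb = genome_length_bp / 1000.0
    covered_bp = sum(len(motif) * copies for motif, copies in repeats)
    return len(repeats) / kb, covered_bp / kb
```

Normalizing per kb is what makes coding and non-coding regions of different lengths comparable, which is the step the abstract describes before concluding that non-coding regions have the higher density.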

7.

Background

Ancestral reconstructions of mammalian genomes have revealed that evolutionary breakpoint regions are clustered in regions that are more prone to break and reorganize. What is still unclear to evolutionary biologists is whether these regions are physically unstable due solely to sequence composition and/or genome organization, or whether they represent genomic areas where the selection against breakpoints is minimal.

Methodology and Principal Findings

Here we present a comprehensive study of the distribution of tandem repeats in great apes. We analyzed the distribution of tandem repeats in relation to the localization of evolutionary breakpoint regions in the human, chimpanzee, orangutan and macaque genomes. We observed an accumulation of tandem repeats in the genomic regions implicated in chromosomal reorganizations. In the case of the human genome, our analyses revealed that evolutionary breakpoint regions contained more base pairs implicated in tandem repeats than synteny blocks did, with the AAAT motif being the one most frequently involved in evolutionary regions. We found that those AAAT repeats located in evolutionary regions were preferentially associated with Alu elements.

Significance

Our observations provide evidence for the role of tandem repeats in shaping mammalian genome architecture. We hypothesize that an accumulation of specific tandem repeats in evolutionary regions can promote genome instability by altering the state of the chromatin conformation or by promoting the insertion of transposable elements.

8.
Microsatellites are abundant across prokaryotic and eukaryotic genomes. However, comparative analysis of microsatellites in the organellar genomes of plants and their utility in understanding phylogeny has not been reported. The purpose of this study was to understand the organization of microsatellites in the coding and non-coding regions of organellar genomes of major cereals, viz. rice, wheat, maize and sorghum. About 5.8-14.3% of mitochondrial and 30.5-43.2% of chloroplast microsatellites were observed in the coding regions. About 83.8-86.8% of known mitochondrial genes had at least one microsatellite, while this value ranged from 78.6-82.9% among the chloroplast genomes. Dinucleotide repeats were the most abundant in the coding and non-coding regions of the mitochondrial genome, while mononucleotides were predominant in chloroplast genomes. Maize harbored more repeats in the mitochondrial genome, which could be due to the larger size of its genome. A phylogenetic analysis based on mitochondrial and chloroplast genomic microsatellites revealed that rice and sorghum were closer to each other, while wheat was the farthest; this agreed with earlier reported phylogenies based on nuclear genome co-linearity and chloroplast gene-based analysis.

9.
Genome plasticity is considered as a means for bacteria to adapt to their environment. Plasticity in tandem repeat sequences on bacterial genomes has been recently exploited to trace the epidemiology of pathogens. Here, we examine the utility of minisatellite (i.e., a repeat unit of six nucleotides or more) typing in non-pathogenic food bacteria of the species Lactococcus lactis. Thirty-four minisatellites identified on the sequenced L. lactis ssp. lactis strain IL1403 genome were first analyzed in 10 closely related ssp. lactis strains, as determined by randomly amplified polymorphic DNA (RAPD). The selected tandem repeats varied in length, percent identity between repeats, and locations. We showed that: (i) the greatest polymorphism was in ORFs encoding exported proteins or in intergenic regions; (ii) two thirds of minisatellites were little- or non-variable, despite as much as 90% identity between tandem repeats; and (iii) dendrograms based on either RAPD or minisatellite analyses were similar. Seven minisatellites identified in this study are potentially useful for lactococcal typing. We then asked whether tandem repeats in L. lactis were stable upon very long-term (up to two years) storage. Despite large rearrangements previously reported in derivative strains, just one of 10 minisatellites tested underwent an alteration, suggesting that tandem repeat rearrangements probably occur during active DNA replication. We conclude that multiple locus minisatellite analysis can be a valuable tool to follow lactococcal strain diversity.

10.
Whole-genome duplication (WGD) is believed to be one of the major evolutionary events that shaped the genome organization of vertebrates. Here, we review recent research on vertebrate genome evolution, specifically on WGD and its consequences for gene and genome evolution in teleost fishes. Recent genome analyses confirmed that all vertebrates experienced two rounds of WGD early in their evolution, and that teleosts experienced a subsequent additional third-round (3R)-WGD. The 3R-WGD was estimated to have occurred 320–400 million years ago in a teleost ancestor, but after its divergence from a common ancestor with living non-teleost actinopterygians (Bichir, Sturgeon, Bowfin, and Gar) based on the analyses of teleost-specific duplicate genes. This 3R-WGD was confirmed by synteny analysis and ancestral karyotype inference using the genome sequences of Tetraodon and medaka. Most of the tetrapods, on the other hand, have not experienced an additional WGD; however, they have experienced repeated chromosomal rearrangements throughout the whole genome. Therefore, different types of chromosomal events have characterized the genomes of teleosts and tetrapods, respectively. The 3R-WGD is useful to investigate the consequences of WGD because it is an evolutionarily recent WGD and thus teleost genomes retain many more WGD-derived duplicates and “traces” of their evolution. In addition, the remarkable morphological, physiological, and ecological diversity of teleosts may facilitate understanding of macrophenotypic evolution on the basis of genetic/genomic information. We highlight the teleosts with 3R-WGD as unique models for future studies on ecology and evolution taking advantage of emerging genomics technologies and systems biology environments.

11.
Although satellite DNAs are well-explored components of heterochromatin and centromeres, little is known about emergence, dispersal and possible impact of comparably structured tandem repeats (TRs) on the genome-wide scale. Our bioinformatics analysis of assembled Tribolium castaneum genome disclosed significant contribution of TRs in euchromatic chromosomal arms and clear predominance of satellite DNA-typical 170 bp monomers in arrays of ≥5 repeats. By applying different experimental approaches, we revealed that the nine most prominent TR families Cast1–Cast9 extracted from the assembly comprise ∼4.3% of the entire genome and reside almost exclusively in euchromatic regions. Among them, seven families that build ∼3.9% of the genome are based on ∼170 and ∼340 bp long monomers. Results of phylogenetic analyses of 2500 monomers originating from these families show high-sequence dynamics, evident by extensive exchanges between arrays on non-homologous chromosomes. In addition, our analysis shows that concerted evolution acts more efficiently on longer than on shorter arrays. Efficient genome-wide distribution of nine TR families implies the role of transposition only in expansion of the most dispersed family, and involvement of other mechanisms is anticipated. Despite similarities in sequence features, FISH experiments indicate high-level compartmentalization of centromeric and euchromatic tandem repeats.

12.
MOTIVATION: Tandem repeats (TRs) are associated with human disease, play a role in evolution and are important in regulatory processes. Despite their importance, locating and characterizing these patterns within anonymous DNA sequences remains a challenge. In part, the difficulty is due to imperfect conservation of patterns and complex pattern structures. We study recognition algorithms for two complex pattern structures: variable length tandem repeats (VLTRs) and multi-period tandem repeats (MPTRs). RESULTS: We extend previous algorithmic research to a class of regular tandem repeats (RegTRs). We formally define RegTRs, as well as two important subclasses: VLTRs and MPTRs. We present algorithms for identification of TRs in these classes. Furthermore, our algorithms identify degenerate VLTRs and MPTRs: repeats containing substitutions, insertions and deletions. To illustrate our work, we present results of our analysis for two difficult regions in cattle and human data which reflect practical occurrences of these subclasses in GenBank sequence data. In addition, we show the applicability of our algorithmic techniques for identifying Alu sequences, gene clusters and other distant regions of similarity. We illustrate this with an example from yeast chromosome I.

13.
Inverted repeats have been found to occur in both prokaryotic and eukaryotic genomes. Usually they are short and some have important functions in various biological processes. However, long inverted repeats are rare and can cause genome instability. Analyses of the C. elegans genome identified long, nearly perfect inverted repeat sequences involving both divergently and convergently oriented homologous gene pairs and complete intergenic sequences. Comparisons with the orthologous regions from the genomes of C. briggsae and C. remanei show that the inverted repeat structures are often far more conserved than the sequences. This observation implies that there is an active mechanism for maintaining the inverted repeat nature of the sequences.

14.
Complete archaeal genomes were probed for the presence of long (≥ 25 bp) oligonucleotide repeats (words). We detected the presence of many words distributed in tandem with narrow ranges of periodicity (i.e., spacer length between repeats). Similar words were not identified in genomes of non-archaeal species, namely Escherichia coli, Bacillus subtilis, Haemophilus influenzae, Mycoplasma genitalium and Mycoplasma pneumoniae. BLAST similarity searches against the GenBank nucleotide sequence database revealed that these words were archaeal species-specific, indicating that they are of a signature character. Sequence analysis and genome viewing tools showed these repeats to be restricted to non-coding regions. Thus, archaea appear to possess a non-coding genomic signature that is absent in bacterial species. The identification of a species-specific genomic signature would be of great value to archaeal genome mapping, evolutionary studies and analyses of genome complexity.

15.
The mitochondrial genome of the Komodo dragon (Varanus komodoensis) was nearly completely sequenced, except for two highly repetitive noncoding regions. An efficient sequencing method for squamate mitochondrial genomes was established by combining the long polymerase chain reaction (PCR) technology and a set of reptile-oriented primers designed for nested PCR amplifications. It was found that the mitochondrial genome had novel gene arrangements in which genes from NADH dehydrogenase subunit 6 to proline tRNA were extensively shuffled with duplicate control regions. These control regions had 99% sequence similarity over 700 bp. Although snake mitochondrial genomes are also known to possess duplicate control regions with nearly identical sequences, the location of the second control region suggested independent occurrence of the duplication on lineages leading to snakes and the Komodo dragon. Another feature of the mitochondrial genome of the Komodo dragon was the considerable number of tandem repeats, including sequences with a strong secondary structure, as a possible site for the slipped-strand mispairing in replication. These observations are consistent with hypotheses that tandem duplications via the slipped-strand mispairing may induce mitochondrial gene rearrangements and may serve to maintain similar copies of the control region.

16.
Mononucleotide repeats (MNRs) have been systematically investigated in the genomes of eukaryotic and prokaryotic organisms. However, detailed information on the distribution of MNRs in viral genomes is limited. In this study, we examined the distributions of MNRs in 256 fully sequenced virus genomes; the distributions varied extensively across viral genomes and were significantly influenced by both genome size and CG content. Furthermore, the ratio of the observed to the expected number of MNRs (O/E ratio) appears to be influenced by both the host range and genome type of a particular virus. Additionally, the densities and frequencies of MNRs in genic regions are lower than in non-coding regions, suggesting that selective pressure acts on viral genomes. We also discuss the potential functional roles that these MNR loci could play in virus genomes. To our knowledge, this is the first analysis focusing on MNRs in viruses, and our study could have potential implications for a deeper understanding of virus genome stability and the co-evolution that occurs between a virus and its host.

17.
After the dog genome was sequenced, an increasing number of studies involving genetic research of dogs have been conducted to understand gene functions and mammalian evolution. To study the genetic diversity in dogs and other mammals, genetic markers linked to function and conserved in wide lineages are necessary. Thus far, few polymorphic markers have been used in dogs. In this study, we surveyed the entire dog genome and used our prediction model to identify a total of 109 tandem repeats (TRs) located in protein-coding regions that may be polymorphic. We selected 10 TRs that may be related to neurophysiology and neural development, and tested them in 167 individuals of 8 dog breeds: 5 European dog breeds (Beagle, Golden Retriever, Labrador Retriever, German Shepherd, and Toy Poodle) and 3 Japanese dog breeds (Japanese Spitz, Shiba, and Shikoku). Among the tested TRs, nine were polymorphic, indicating that 90% of the TRs were successfully predicted to be polymorphic. PCR fragments of the TRs were amplified from dog brain cDNA, showing their expression in the dog brain. Our results provide abundant opportunities for the study of phenotypic variations in dogs, and our prediction method for variable number of tandem repeats (VNTRs) can be applied to any other animal genome sequences for the survey of functional and polymorphic markers.

18.
Simple sequence repeats (SSRs) composed of extensive tandem iterations of a single nucleotide or a short oligonucleotide are rare in most bacterial genomes, but they are common among Mycoplasma. Some of these repeats act as contingency loci in association with families of surface antigens. By contraction or expansion during replication, these SSRs increase genetic variance of the population and facilitate avoidance of the immune response of the host. Occurrence and distribution of SSRs are analyzed in complete genomes of 11 Mycoplasma and 3 related Mollicutes in order to gain insights into functional and evolutionary diversity of the SSRs in Mycoplasma. The results revealed an unexpected variety of SSRs with respect to their distribution and composition and suggest that it is unlikely that all SSRs function as contingency loci or recombination hot spots. Various types of SSRs are most abundant in Mycoplasma hyopneumoniae, whereas Mycoplasma penetrans, Mycoplasma mobile, and Mycoplasma synoviae do not contain unusually long SSRs. Mycoplasma hyopneumoniae and Mycoplasma pulmonis feature abundant short adenine and thymine runs periodically spaced at 11 and 12 bp, respectively, which likely affect the supercoiling propensities of the DNA molecule. Physiological roles of long adenine and thymine runs in M. hyopneumoniae appear independent of location upstream or downstream of genes, unlike contingency loci that are typically located in protein-coding regions or upstream regulatory regions. Comparisons among 3 M. hyopneumoniae strains suggest that the adenine and thymine runs are rarely involved in genome rearrangements. The results indicate that the SSRs in the Mycoplasma genomes play diverse roles, including modulating gene expression as contingency loci, facilitating genome rearrangements via recombination, affecting protein structure and possibly protein-protein interactions, and contributing to the organization of the DNA molecule in the cell.

19.
DNA tandem repeats (TRs) are ubiquitous genomic features which consist of two or more adjacent copies of an underlying pattern sequence. The copies may be identical or approximate. Variable number of tandem repeats or VNTRs are polymorphic TR loci in which the number of pattern copies is variable. In this paper we describe VNTRseek, our software for discovery of minisatellite VNTRs (pattern size ≥ 7 nucleotides) using whole genome sequencing data. VNTRseek maps sequencing reads to a set of reference TRs and then identifies putative VNTRs based on a discrepancy between the copy number of a reference and its mapped reads. VNTRseek was used to analyze the Watson and Khoisan genomes (454 technology) and two 1000 Genomes family trios (Illumina). In the Watson genome, we identified 752 VNTRs with pattern sizes ranging from 7 to 84 nt. In the Khoisan genome, we identified 2572 VNTRs with pattern sizes ranging from 7 to 105 nt. In the trios, we identified between 2660 and 3822 VNTRs per individual and found nearly 100% consistency with Mendelian inheritance. VNTRseek is, to the best of our knowledge, the first software for genome-wide detection of minisatellite VNTRs. It is available at http://orca.bu.edu/vntrseek/.
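The abstract's core idea, flagging a locus when mapped reads disagree with the reference copy number, can be illustrated with a small sketch. This is not VNTRseek's actual algorithm or API; the function name `putative_vntr_alleles` and the `min_support` threshold are hypothetical, and the sketch assumes the copy number spanned by each mapped read has already been determined:

```python
from collections import Counter

def putative_vntr_alleles(reference_copies, read_copies, min_support=2):
    """Return the non-reference copy numbers seen at one TR locus.

    reference_copies: copy number of the pattern in the reference TR.
    read_copies: copy number observed in each read mapped to that TR.
    min_support: hypothetical minimum number of reads required before a
                 differing copy number is reported (guards against
                 single-read sequencing errors).
    """
    counts = Counter(read_copies)
    return sorted(copies for copies, n_reads in counts.items()
                  if copies != reference_copies and n_reads >= min_support)
```

A locus would be called a putative VNTR when the returned list is non-empty. Requiring more than one supporting read is a design choice made here for illustration only.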

20.
Complete eukaryote chromosomes were investigated for intrachromosomal duplications of nucleotide sequences. The analysis was performed by looking for nonexact repeats on two complete genomes, Saccharomyces cerevisiae and Caenorhabditis elegans, and four partial ones, Drosophila melanogaster, Plasmodium falciparum, Arabidopsis thaliana, and Homo sapiens. Through this analysis, we show that all eukaryote chromosomes exhibit similar characteristics for their intrachromosomal repeats, suggesting similar dynamics: many direct repeats have their two copies physically close together, and these close direct repeats are more similar and shorter than the other repeats. By contrast, there are almost no close inverted repeats. These results support a model for the dynamics of duplication. This model is based on a continuous genesis of tandem repeats and implies that most of the distant and inverted repeats originate from these tandem repeats by further chromosomal rearrangements (insertions, inversions, and deletions). Remnants of these predicted rearrangements have been brought out through fine analysis of the chromosome sequence. Despite these dynamics, shared by all eukaryotes, each genome exhibits its own style of intrachromosomal duplication: the density of repeated elements is similar in all chromosomes from the same genome, but differs between species. This density was further related to the relative rates of duplication, deletion, and mutation proper to each species. Notably, the density of repeats in the X chromosome of C. elegans is much lower than in the autosomes of that organism, suggesting that the exchange between homologous chromosomes is important in the duplication process.
