Similar literature
 20 similar records found
1.
We calculated the occurrences of all dinucleotide and trinucleotide microsatellites in the human, mouse, and yeast genomes. The microsatellites were considered separately not only according to the repeated dinucleotide or trinucleotide and the microsatellite length but also according to the starting/terminal nucleotide. The analysis showed that microsatellites differing only in their terminal nucleotides occur in dramatically unequal numbers in the human genome. For example, the 23-mer (TTG)(7)TT occurs 635 times in the human genome, whereas (GTT)(7)GT is present only three times, although the two 23-mers share a 22-nucleotide sequence. Such dramatically unequal occurrences of microsatellites differing only in the terminal nucleotides are observed for most dinucleotide and trinucleotide microsatellites and in all analyzed genomes. We suppose that the strikingly unequal genomic occurrences of these closely related microsatellites originate from conformational properties of DNA.
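
The 23-mer comparison in this abstract can be checked with a short sketch. The helper names (`microsatellite`, `count_overlapping`) and the toy sequence are illustrative assumptions, not the authors' pipeline; the abstract does not say whether overlapping matches or both strands were counted, so overlapping matches on a single strand are assumed here.

```python
def microsatellite(unit: str, repeats: int, tail: str = "") -> str:
    """Build a repeat tract such as (TTG)7TT from its unit, copy number and tail."""
    return unit * repeats + tail


def count_overlapping(sequence: str, motif: str) -> int:
    """Count occurrences of motif in sequence, allowing overlapping matches."""
    count = 0
    start = sequence.find(motif)
    while start != -1:
        count += 1
        start = sequence.find(motif, start + 1)
    return count


# The two 23-mers named in the abstract.
ttg_form = microsatellite("TTG", 7, "TT")
gtt_form = microsatellite("GTT", 7, "GT")

# They overlap in 22 nucleotides: the first 22 of one, the last 22 of the other.
shared_22mer = ttg_form[:-1]

# Hypothetical toy sequence (not genomic data) that contains both forms once.
toy_sequence = "AC" + "G" + ttg_form + "CA"
```

On a real genome the same `count_overlapping` call would be applied to each chromosome sequence in turn; the toy sequence only demonstrates that the two forms are distinguished by a single flanking nucleotide.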

2.
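Using 31 genome sequences of human adenovirus subspecies B and 39 genome sequences of subspecies D as study material, we systematically analyzed and compared the distribution of simple sequence repeats (SSRs) in these genomes with the Imperfect Microsatellite Extractor and DNAMAN software. The results show that the mean relative density of SSRs is very similar in the genomes of subspecies B and D, but that the distribution differs among SSR types. Dinucleotide SSRs are markedly more abundant in subspecies D than in subspecies B. Among mononucleotide SSRs, (A)n and (T)n are relatively numerous in both subspecies, while among dinucleotide SSRs, (CG/GC)n shows a strong preference in both. In multiple sequence alignments within each subspecies, subspecies D showed higher stability. This specific distribution of SSRs in subspecies B and D may be related to their evolutionary mechanisms and pathogenicity.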

3.
Expansion of trinucleotide repeats (CAG)n and (CGG)n is found in genes responsible for certain human hereditary neurodegenerative diseases. By gel-mobility shift assay, we detected a single-stranded (AGC)n repeat-binding activity primarily in mouse brain extracts and very low or undetectable activity in other tissue extracts. Two (AGC)n-repeat binding proteins, with apparent molecular weights of 44 and 40 kDa, have been purified from adult mouse brain by a DNA affinity column and fast protein liquid chromatography. UV cross-linking of radiolabeled (AGC)n repeats with crude brain extracts and with the two purified proteins of 44 and 40 kDa produced identical doublet bands, indicating that these proteins are in fact responsible for the (AGC)n-binding activity in brain extracts. We designated these two proteins TRIP-1 for the 44 kDa protein and TRIP-2 for the 40 kDa protein, where TRIP represents trinucleotide repeat-binding protein. TRIP-1 and TRIP-2 bind to a specific subset of trinucleotide repeat sequences including (AGC)n, (AGT)n, (GGC)n, and (GGT)n repeats but not to various other trinucleotide repeats. A minimum of eight (AGC) trinucleotide repeating units is required for TRIP-1 and -2 recognition and binding. The (AGC)n repeat-binding activity increases in the brain after birth and reaches a plateau within 3 weeks. In the brain, TRIP-1 and TRIP-2 may alter the function of the genes containing the expanded trinucleotide repeats.

4.
Many alternative splice events result in subtle mRNA changes, and most of them occur at short-distance tandem donor and acceptor sites. The splicing mechanism of such tandem sites likely involves the stochastic selection of either splice site. While tandem splice events are frequent, it is unknown how many are functionally important. Here, we use phylogenetic conservation to address this question, focusing on tandems with a distance of 3-9 nucleotides. We show that previous contradictory results on whether alternative or constitutive tandem motifs are more conserved between species can be explained by a statistical paradox (Simpson's paradox). Applying methods that take biases into account, we found higher conservation of alternative tandems in mouse, dog, and even chicken, zebrafish, and Fugu genomes. We estimated a lower bound for the number of alternative sites that are under purifying (negative) selection. While the absolute number of conserved tandem motifs decreases with the evolutionary distance, the fraction under selection increases. Interestingly, a number of frameshifting tandems are under selection, suggesting a role in regulating mRNA and protein levels via nonsense-mediated decay (NMD). An analysis of the intronic flanks shows that purifying selection also acts on the intronic sequence. We propose that stochastic splice site selection can be an advantageous mechanism that allows constant splice variant ratios in situations where a deviation in this ratio is deleterious.

5.
We have previously shown that GAA trinucleotide repeats have undergone significant expansion in the human genome. Here we present the analysis of the length distribution of all 10 nonredundant trinucleotide repeat motifs in 20 complete eukaryotic genomes (6 mammalian, 2 nonmammalian vertebrates, 4 arthropods, 4 fungi, and 1 each of nematode, amoebozoa, alveolate, and plant), which showed that the abundance of large expansions of GAA trinucleotide repeats is specific to mammals. Analysis of human-chimpanzee-gorilla orthologs revealed that loci with large expansions are species-specific and have occurred after divergence from the common ancestor. PCR analysis of human controls revealed large expansions at multiple human (GAA)(30+) loci; nine loci showed expanded alleles containing >65 triplets, analogous to disease-causing expansions in Friedreich ataxia, including two that are in introns of genes of unknown function. The abundance of long GAA trinucleotide repeat tracts in mammalian genomes represents a significant mutation potential and source of interindividual variability.
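
The figure of 10 nonredundant trinucleotide repeat motifs can be reproduced by grouping motifs that describe the same double-stranded tract. The sketch below assumes the usual convention (not spelled out in the abstract) that two motifs are equivalent if one is a cyclic shift of the other or of its reverse complement, and that homopolymers such as AAA are excluded as mononucleotide repeats. The function names are illustrative.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(motif: str) -> str:
    """Return the alphabetically smallest cyclic shift of motif or its reverse complement."""
    reverse_complement = motif.translate(COMPLEMENT)[::-1]
    forms = [
        strand[i:] + strand[:i]
        for strand in (motif, reverse_complement)
        for i in range(len(motif))
    ]
    return min(forms)


def is_shorter_repeat(motif: str) -> bool:
    """True if motif is itself a repeat of a shorter unit (e.g. AAA, or ACAC)."""
    k = len(motif)
    return any(k % d == 0 and motif == motif[:d] * (k // d) for d in range(1, k))


def motif_classes(k: int) -> list[str]:
    """List one canonical representative per nonredundant motif class of length k."""
    classes = set()
    for letters in product("ACGT", repeat=k):
        motif = "".join(letters)
        if not is_shorter_repeat(motif):
            classes.add(canonical(motif))
    return sorted(classes)


trinucleotide_classes = motif_classes(3)
```

Under this convention GAA and TTC fall into the same class, so a scan for one motif per class covers both strands without double counting.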

6.
Unusual expansion of trinucleotide repeats has been identified as a common mechanism of hereditary neurodegenerative diseases. Although the actual mechanism of repeat expansion remains uncertain, trinucleotide repeat instability may be related to the increased stability of an alternative DNA hairpin structure formed in the repeat sequences. Here we report that a synthetic ligand, naphthyridine carbamate dimer (NCD), selectively bound to and stabilized an intrastrand hairpin structure in CGG repeat sequences. The NCD-CGG hairpin complex was a stable structure that efficiently interfered with DNA replication by Taq DNA polymerase. Given the sequence preference of NCD, it should be a valuable tool for investigating the genetic instability of CGG/CCG repeat sequences in human genomes.

7.
Hancock JM 《Genetica》2002,115(1):93-103
The relationship between the level of repetitiveness in genomic sequences and genome size has been re-investigated making use of the rapidly growing database of complete eubacterial and archaeal genome sequences combined with the fragmentary but now large amount of data from eukaryotic genomes. Relative simplicity factors (RSFs), which measure the repetitiveness of sequences, were calculated and significantly simple motifs (SSMs), which identify the kinds of sequences that are repeated, were identified. A previously reported correlation between genome size and repetitiveness was confirmed, but it was shown that the higher RSFs seen in eukaryotic genomes also reflect a generally higher level of repetitiveness independent of genome size differences. Differences in genome size are responsible for about 10% of the variance in RSF seen between species. The spectrum of SSMs seen within a genome differed markedly within the eubacteria but less so in eukaryotes and, particularly, in archaea. Species with SSM spectra that differ from the norm tend also to have high RSFs for their genome size and to be pathogens that make use of repetitive sequences to avoid host defence responses. Some of the variance in repetitiveness seen in other species may therefore also reflect the action of selection, although other forces, such as variation in the effectiveness of mechanisms for regulating slippage errors of replication, may also be important.

8.
The abundance of simple sequence repeats (SSRs), or microsatellites, and their inherent potential for extensive allelic variation make them a valuable source of genetic markers in eukaryotes. In this study, we analyzed and compared the abundance and organisation of SSRs in the genomes of two important fungal pathogens of wheat, brown or leaf rust (Puccinia triticina) and black or stem rust (Puccinia graminis f. sp. tritici). The P. triticina genome, which is twice the size of the P. graminis f. sp. tritici genome, has a lower relative SSR abundance and density. The distribution pattern of the different SSR motifs provides evidence of greater accumulation of dinucleotide repeats, followed by trinucleotide repeats. More than two hundred different types of repeat motifs were observed in the genomes. The longest SSR motifs differed between the two genomes, and some repeat motifs occurred at higher frequency. This survey of the relative abundance, relative density, length and frequency of different repeat motifs in Puccinia sp. will be useful for developing SSR markers that could find several applications in the analysis of fungal genomes, such as genetic diversity, population genetics, race identification and acquisition of new virulence.

9.
Simple sequence repeats (SSRs) exist in both eukaryotic and prokaryotic genomes and are the most popular genetic markers, but the SSRs of mosquito genomes are still not well understood. In this study, we identified and analyzed the SSRs in 23 mosquito species at the whole-genome level, using Drosophila melanogaster as a reference. The results show that SSR numbers (33 076-560 175/genome) and genome sizes (574.57-1342.21 Mb) are significantly positively correlated (R~= 0.8992, P < 0.01), but the correlation in individual species varies among these mosquito species. Of the six types of SSR, mono- to trinucleotide SSRs are dominant, with cumulative percentages of 95.14%-99.00% and densities of 195.65/Mb-787.51/Mb, whereas tetra- to hexanucleotide SSRs are rare, with 1.12%-4.22% and 3.76/Mb-40.23/Mb. (A/T)n, (AC/GT)n and (AGC/GCT)n are the most frequent motifs in mononucleotide, dinucleotide and trinucleotide SSRs, respectively, and the motif frequencies of tetra- to hexanucleotide SSRs appear to be species-specific. SSRs of 10-20 bp in length are dominant, with a number of 11() 561 ± 93 482 and a frequency of 87.25% ± 5.73% on average, and both number and frequency decline as length increases. Most SSRs (83.34% ± 7.72%) are located in intergenic regions, followed by intron regions (11.59% ± 5.59%), exon regions (3.74% ± 1.95%), and untranslated regions (1.32% ± 1.39%). Mono-, di- and trinucleotide SSRs are the main SSRs in both gene regions (98.55% ± 0.85%) and exon regions (99.27% ± 0.52%). On average, 42.52% of all genes contain SSRs, and the preference for SSR occurrence in different gene subcategories is species-specific. The study provides useful insights into SSR diversity, characteristics and distribution in the genomes of 23 mosquito species.
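
Counts and per-megabase densities of the kind reported here can be illustrated with a minimal sketch for perfect SSRs. It is not the tool used in the study: the regular-expression approach, the 10 bp minimum tract length and the function names are assumptions made for illustration only, and imperfect or compound repeats are ignored.

```python
import re


def find_ssrs(sequence: str, min_length: int = 10, max_unit: int = 6):
    """Find perfect SSRs with unit sizes 1..max_unit and at least min_length bp.

    Returns a sorted list of (start, unit, copies) tuples.
    """
    hits = []
    for k in range(1, max_unit + 1):
        # Smallest copy number whose tract reaches min_length (at least 2 copies).
        min_copies = max(2, -(-min_length // k))
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_copies - 1))
        for match in pattern.finditer(sequence):
            unit = match.group(1)
            # Skip units that are repeats of a shorter motif (e.g. ACAC for k=4),
            # since those tracts are reported under the shorter unit size.
            if any(k % d == 0 and unit == unit[:d] * (k // d) for d in range(1, k)):
                continue
            hits.append((match.start(), unit, len(match.group(0)) // k))
    return sorted(hits)


def ssr_density_per_mb(ssr_count: int, genome_length_bp: int) -> float:
    """SSR density expressed as SSRs per megabase of sequence."""
    return ssr_count / (genome_length_bp / 1_000_000)


# Hypothetical toy sequence with one mono-, one di- and one trinucleotide SSR.
toy = "GG" + "A" * 12 + "CT" + "AC" * 6 + "GT" + "AGC" * 4 + "TT"
toy_hits = find_ssrs(toy)
```

Grouping `toy_hits` by `len(unit)` gives the per-type counts; dividing each count by the genome length in megabases, as in `ssr_density_per_mb`, gives densities comparable in form to those quoted in the abstract.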

10.
Expanded trinucleotide repeats underlie a growing number of human diseases. The human FMR1 (CGG)(n) array can exhibit genetic instability characterized by progressive expansion over several generations leading to gene silencing and the development of the fragile X syndrome. While expansion is dependent upon the length of uninterrupted (CGG)(n), instability occurs in a limited germ line and early developmental window, suggesting that lineage-specific expression of other factors determines the cellular environment permissive for expansion. To identify these factors, we have established normal- and premutation-length human FMR1 (CGG)(n) arrays in the yeast Saccharomyces cerevisiae and assessed the frequency of length changes greater than 5 triplets in cells deficient in various DNA repair and replication functions. In contrast to previous studies with Escherichia coli, we observed a low frequency of orientation-dependent large expansions in arrays carrying long uninterrupted (CGG)(n) tracts in a wild-type background. This frequency was unaffected by deletion of several DNA mismatch repair genes or deletion of the EXO1 and DIN7 genes and was not enhanced through meiosis in a wild-type background. Array contraction occurred in an orientation-dependent manner in most mutant backgrounds, but loss of Sgs1p resulted in a generalized increase in array stability in both orientations. In contrast, FMR1 arrays had a 10-fold-elevated frequency of expansion in a rad27 background, providing evidence for a role of lagging-strand Okazaki fragment processing in (CGG)(n) triplet repeat expansion.

11.
12.
MOTIVATION: Analysis of statistical properties of DNA sequences is important for evolutionary biology as well as for DNA probe and PCR technologies. These technologies, in turn, can be used for organism identification, which implies applications in the diagnosis of infectious diseases, environmental studies, etc. RESULTS: We present results of the correlation analysis of distributions of the presence/absence of short nucleotide subsequences of different length ('n-mers', n = 5-20) in more than 1500 microbial and virus genomes, together with five genomes of multicellular organisms (including human). We calculate whether a given n-mer is present or absent in a given genome (frequency of presence), rather than the more commonly calculated number of appearances of n-mers in one or more genomes (frequency of appearance). For organisms that are not close relatives of each other, the presence/absence of different 7-20mers in their genomes is not correlated. For close biological relatives, some correlation of the presence of n-mers in this range appears, but it is not as strong as expected. Suppressed correlations among the n-mers present in different genomes make it possible to use random sets of n-mers (with appropriately chosen n) to discriminate, with a low probability of error, between genomes of different organisms and possibly between individual genomes of the same species, including human.
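
The distinction between presence/absence and counts of n-mers can be made concrete with a small sketch. The abstract does not state which correlation statistic was used; the phi coefficient (the Pearson correlation of two binary presence vectors over all 4^n possible n-mers) is an assumption here, as are the function names and the single-strand treatment.

```python
from math import sqrt


def nmers_present(sequence: str, n: int) -> set[str]:
    """Set of distinct n-mers (A/C/G/T only) that occur at least once in sequence."""
    found = set()
    for i in range(len(sequence) - n + 1):
        nmer = sequence[i:i + n]
        if set(nmer) <= set("ACGT"):
            found.add(nmer)
    return found


def presence_correlation(seq_a: str, seq_b: str, n: int) -> float:
    """Phi coefficient between the n-mer presence/absence vectors of two sequences.

    Returns 0.0 when either vector is constant (all present or all absent),
    because the correlation is undefined in that case.
    """
    present_a = nmers_present(seq_a, n)
    present_b = nmers_present(seq_b, n)
    total = 4 ** n                      # number of possible n-mers
    both = len(present_a & present_b)   # n-mers present in both sequences
    count_a, count_b = len(present_a), len(present_b)
    denominator = sqrt(count_a * (total - count_a) * count_b * (total - count_b))
    if denominator == 0:
        return 0.0
    return (total * both - count_a * count_b) / denominator
```

Only the sets of present n-mers enter the calculation, so an n-mer that occurs a thousand times contributes exactly as much as one that occurs once, which is the sense of "frequency of presence" in the abstract.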

13.

Background  

Genome sequences vary strongly in their repetitiveness and the causes for this are still debated. Here we propose a novel measure of genome repetitiveness, the index of repetitiveness, I_r, which can be computed in time proportional to the length of the sequences analyzed. We apply it to 336 genomes from all three domains of life.

14.
Comparative analyses between human disease and non-disease genes are of great interest in understanding human disease gene evolution. However, the progression of neurodegenerative diseases (NDD) involving amyloid formation in specific brain regions is still unknown. Therefore, in this study, we focused our analysis on the evolutionary features of human NDD genes with respect to non-disease genes. We observed that human NDD genes are evolutionarily conserved relative to non-disease genes. To elucidate the conserved nature of NDD genes, we incorporated evolutionary attributes such as gene expression level, number of regulatory miRNAs, protein connectivity, intrinsic disorder content and relative aggregation propensity in our analysis. Our studies demonstrate that NDD genes have higher gene expression levels, consistent with their lower evolutionary rates. Additionally, we observed that NDD genes have more target sites for different regulatory miRNAs and more interaction partners than non-disease genes. Moreover, miRNA-targeted genes are known to have higher disorder content. In contrast, our analysis established that NDD genes have lower disorder content. In support of this, we found that NDD gene-encoded proteins are enriched in multi-interface hubs (party hubs) with lower disorder content. Since proteins with higher disorder content need to adopt special structures to reduce their aggregation propensity, NDD proteins were found to have elevated relative aggregation propensity (RAP), in keeping with their lower disorder content. Finally, our categorical regression analysis confirmed the relative dominance of protein connectivity, 3′UTR length, RAP, nature of hubs (singlish/multi-interface) and disorder content in explaining the variation in evolutionary rates between human NDD genes and non-disease genes.

15.
Molecular mechanisms responsible for the genetic instability of DNA trinucleotide sequences (TRS) account for at least 20 human hereditary disorders. Many aspects of DNA metabolism influence the frequency of length changes in such repeats. Herein, we demonstrate that expression of Escherichia coli SOS repair proteins dramatically decreases the genetic stability of long (CTG/CAG)n tracts contained in plasmids. Furthermore, the growth characteristics of the bacteria are affected by the (CTG/CAG)n tract, with the effect dependent on the length of the TRS. In an E. coli host strain with constitutive expression of the SOS regulon, the frequency of deletions in the repeat is substantially higher than that in a strain with no SOS response. Analyses of the topology of reporter plasmids isolated from the SOS+ and SOS- strains revealed higher levels of negative supercoiling in strains with the constitutively expressed SOS network. Hence, we used strains with mutations in topoisomerases to examine the effect of DNA topology upon the TRS instability. Higher levels of negative DNA supercoiling correlated with increased deletions in long (CTG/CAG)n, (CGG/CCG)n and (GAA/TTC)n. These observations suggest a link between the induction of bacterial SOS repair, changes in DNA topology and the mechanisms leading to genetic instability of repetitive DNA sequences.

16.
S Trivedi  JM Hancock 《Gene》2012,508(1):73-77
The locations of microsatellites in mammalian genomes are restricted by purifying selection in a number of ways. For example, with the exception of some trinucleotide repeats, they are excluded from protein coding regions of genomes because of their tendency to cause frameshift mutations. Here we investigate whether purifying selection might affect the types and frequencies of microsatellites in microRNA (miRNA). We concentrate on miRNAs expressed in neurons and the brain (NB-miRNAs), as microsatellites in these genes might give rise to effects similar to those of disease-causing repeats in protein coding genes. We show that in human miRNAs in general AG and AT microsatellites are reduced in frequency compared to AC repeats and that NB-miRNA genes contain significantly fewer microsatellites than expected from frequencies of microsatellites in other miRNA genes. NB-miRNAs show lower levels of sequence divergence in comparisons of human-macaque orthologues and more often have detectable orthologues in non-human mammals than non-NB-miRNAs. This suggests that microsatellites in miRNAs may indeed be constrained by purifying selection and that the strength of this selection may differ between NB-miRNAs and non-NB-miRNAs. We identify a number of ways in which the potential disruption of pre-miRNA secondary structure might result in purifying selection. However, other, non-selective forces could also play a role in generating the biases observed in miRNA microsatellites.

17.
18.
Our thesis is that the DNA composition and structure of genomes are selected in part by mutation bias (GC pressure) and in part by ecology. To illustrate this point, we compare and contrast the oligonucleotide composition and the mosaic structure in 36 complete genomes and in 27 long genomic sequences from archaea and eubacteria. We report the following findings: (1) High-GC-content genomes show a large underrepresentation of short distances between G(n) and C(n) homopolymers with respect to distances between A(n) and T(n) homopolymers; we discuss selection versus mutation bias hypotheses. (2) The oligonucleotide compositions of the genomes of Neisseria (meningitidis and gonorrhoeae), Helicobacter pylori and Rhodobacter capsulatus are more biased than the other sequenced genomes. (3) The genomes of free-living species or nonchronic pathogens show more mosaic-like structure than genomes of chronic pathogens or intracellular symbionts. (4) Genome mosaicity of intracellular parasites has a maximum corresponding to the average gene length; in the genomes of free-living species and nonchronic pathogens the maximum occurs at larger length scales. This suggests that free-living species can incorporate large pieces of DNA from the environment, whereas for intracellular parasites there are recombination events between homologous genes. We discuss the consequences in terms of evolution of genome size. (5) Intracellular symbionts and obligate pathogens show a small, but nonzero, amount of chromosome mosaicity, suggesting that recombination events occur in these species.

19.
The objective of this work was to assess the degree of trinucleotide microsatellite length polymorphism in the selfing species Arabidopsis thaliana. PCR amplifications of 12 microsatellite loci among 49 natural populations revealed from one to eight length variants (alleles) for each locus. The average number of alleles per locus was four and the average genetic diversity index was 0.43. Divergence between length variants was investigated at the nucleotide level. Several observations emerge from the sequence data: (1) for most loci, length polymorphism results only from variations in the number of trinucleotide repeats; (2) for a few others, some variability was noted in the flanking sequences; (3) for compound and interrupted loci containing two arrays of trinucleotide repeats, length variations preferentially affect the longest one. Five of the Arabidopsis thaliana accessions were clearly composed of two sublines. In two other accessions, some heterozygous individual plants, probably resulting from recent outcrosses, were found. A phylogenetic tree constructed on the basis of trinucleotide microsatellite allelic diversity shows that genetic relationships among the accessions are not correlated with their geographic origin. Received: 4 November 1997 / Accepted: 3 March 1998

20.
Huntington disease (HD) is an autosomal dominant degenerative disorder caused by an expanded and unstable trinucleotide repeat (CAG)n in a gene (IT-15) on chromosome 4. HD exhibits genetic anticipation, that is, earlier onset in successive generations within a pedigree. From a population-based clinical sample, we ascertained parent-offspring pairs with expanded alleles, to examine the intergenerational behavior of the trinucleotide repeat and its relationship to anticipation. We find that the change in repeat length with paternal transmission is significantly correlated with the change in age at onset between the father and offspring. When expanded triplet repeats of affected parents are separated by median repeat length, we find that the longer paternal and maternal repeats are both more unstable on transmission. However, unlike in paternal transmission, in which longer expanded repeats display greater net expansion than do shorter expanded repeats, in maternal transmission there is no mean change in repeat length for either longer or shorter expanded repeats. We also confirmed the inverse relationship between repeat length and age at onset, the higher frequency of juvenile-onset cases arising from paternal transmission, anticipation as a phenomenon of paternal transmission, and greater expansion of the trinucleotide repeat with paternal transmission. Stepwise multiple regression indicates that, in addition to repeat length of offspring, age at onset of affected parent and sex of affected parent contribute significantly to the variance in age at onset of the offspring. Thus, in addition to triplet repeat length, other factors, which could act as environmental factors, genetic factors, or both, contribute to age at onset. Our data establish that further expansion of paternal repeats within the affected range provides a biological basis of anticipation in HD.
