Similar literature
 Found 20 similar articles (search time: 62 ms)
1.
2.
A model developed for the evolving size of the repetitive part of the eukaryote genome during speciation was subjected to analytical and computer treatment. The basic assumption of the model was that two classes of repetitive DNA contribute mainly to macroevolutionary changes in genome size: arrays of tandem repeats (ATR) changing through unequal crossover and mobile genetic elements (MGE) changing presumably through an integration mechanism of the Tn and IS kind operating in bacteria. Within the framework of this model, the macroevolution of the MGE size is formally equivalent to that of the ATR in the particular case when shifts of chromatids have only one repeat out of register. This allowed us to consider genome size as a large set of various ATRs. The results obtained are as follows. If the duplication and deletion of repeats have unequal fixation probabilities during each speciation act, the predicted species distributions of genome size significantly deviate from the real ones; if they have equal fixation probabilities, there is a conformance between calculated and real distributions. In the latter case, the model reproduces the salient features of real distributions upon acceptance of 1) an upper selective boundary nonspecifically limiting increase in genome size within the evolving taxonomic group and 2) non-neutrality of variability in genome size with respect to speciation.

3.
An algorithm for approximate tandem repeats.   Cited by: 4 (self-citations: 0, citations by others: 4)
A perfect single tandem repeat is defined as a nonempty string that can be divided into two identical substrings, e.g., abcabc. An approximate single tandem repeat is one in which the substrings are similar, but not identical, e.g., abcdaacd. In this paper we consider two criteria of similarity: the Hamming distance (k mismatches) and the edit distance (k differences). For a string S of length n and an integer k our algorithm reports all locally optimal approximate repeats, r = ūû, for which the Hamming distance of ū and û is at most k, in O(nk log (n/k)) time, or all those for which the edit distance of ū and û is at most k, in O(nk log k log (n/k)) time. This paper concentrates on a more general type of repeat called multiple tandem repeats. A multiple tandem repeat in a sequence S is a (periodic) substring r of S of the form r = u^a u', where u is a prefix of r and u' is a prefix of u. An approximate multiple tandem repeat is a multiple repeat with errors; the repeated subsequences are similar but not identical. We precisely define approximate multiple repeats, and present an algorithm that finds all repeats that concur with our definition. The time complexity of the algorithm, when searching for repeats with up to k errors in a string S of length n, is O(nka log (n/k)) where a is the maximum number of periods in any reported repeat. We present some experimental results concerning the performance and sensitivity of our algorithm. The problem of finding repeats within a string is a computational problem with important applications in the field of molecular biology. Both exact and inexact repeats occur frequently in the genome, and certain repeats occurring in the genome are known to be related to diseases in the human.
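The definition of an approximate single tandem repeat under the Hamming distance can be illustrated with a brute-force sketch. This is not the algorithm described in the abstract (which runs in O(nk log (n/k)) time and reports only locally optimal repeats); it simply checks every candidate substring, and the function names `hamming` and `approx_single_repeats` and the `min_half` parameter are hypothetical choices made for this illustration.

```python
from typing import Iterator, Tuple


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def approx_single_repeats(s: str, k: int, min_half: int = 1) -> Iterator[Tuple[int, int, int]]:
    """Yield (start, half_length, mismatches) for every substring
    s[start:start + 2*half_length] whose two halves differ in at most
    k positions. Brute force: every half length and start is tried."""
    n = len(s)
    for half in range(min_half, n // 2 + 1):
        for start in range(0, n - 2 * half + 1):
            left = s[start:start + half]
            right = s[start + half:start + 2 * half]
            d = hamming(left, right)
            if d <= k:
                yield start, half, d


# The two examples from the abstract:
print(list(approx_single_repeats("abcabc", 0, min_half=3)))    # [(0, 3, 0)]
print(list(approx_single_repeats("abcdaacd", 1, min_half=4)))  # [(0, 4, 1)]
```

The string abcabc is a perfect repeat (zero mismatches), while abcdaacd splits into abcd and aacd, which differ in one position and so qualify only when k is at least 1.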

4.
Tolerance to acidic environments is an important property of free-living and pathogenic enteric bacteria. Salmonella enterica serovar Typhimurium possesses two general forms of inducible acid tolerance. One is evident in exponentially growing cells exposed to a sudden acid shock. The other is induced when stationary-phase cells are subjected to a similar shock. These log-phase and stationary-phase acid tolerance responses (ATRs) are distinct in that genes identified as participating in log-phase ATR have little to no effect on the stationary-phase ATR (I. S. Lee, J. L. Slonczewski, and J. W. Foster, J. Bacteriol. 176:1422-1426, 1994). An insertion mutagenesis strategy designed to reveal genes associated with acid-inducible stationary-phase acid tolerance (stationary-phase ATR) yielded two insertions in the response regulator gene ompR. The ompR mutants were defective in stationary-phase ATR but not log-phase ATR. EnvZ, the known cognate sensor kinase, and the porin genes known to be controlled by OmpR, ompC and ompF, were not required for stationary-phase ATR. However, the alternate phosphodonor acetyl phosphate appears to play a crucial role in OmpR-mediated stationary-phase ATR and in the OmpR-dependent acid induction of ompC. This conclusion was based on finding that a mutant form of OmpR, which is active even though it cannot be phosphorylated, was able to suppress the acid-sensitive phenotype of an ack pta mutant lacking acetyl phosphate. The data also revealed that acid shock increases the level of ompR message and protein in stationary-phase cells. Thus, it appears that acid shock induces the production of OmpR, which in its phosphorylated state can trigger expression of genes needed for acid-induced stationary-phase acid tolerance.

5.
MOTIVATION: Tandem repeats (TRs) are associated with human disease, play a role in evolution and are important in regulatory processes. Despite their importance, locating and characterizing these patterns within anonymous DNA sequences remains a challenge. In part, the difficulty is due to imperfect conservation of patterns and complex pattern structures. We study recognition algorithms for two complex pattern structures: variable length tandem repeats (VLTRs) and multi-period tandem repeats (MPTRs). RESULTS: We extend previous algorithmic research to a class of regular tandem repeats (RegTRs). We formally define RegTRs, as well as two important subclasses: VLTRs and MPTRs. We present algorithms for identification of TRs in these classes. Furthermore, our algorithms identify degenerate VLTRs and MPTRs: repeats containing substitutions, insertions and deletions. To illustrate our work, we present results of our analysis for two difficult regions in cattle and human data which reflect practical occurrences of these subclasses in GenBank sequence data. In addition, we show the applicability of our algorithmic techniques for identifying Alu sequences, gene clusters and other distant regions of similarity. We illustrate this with an example from yeast chromosome I.

6.
MOTIVATION: A tandem repeat in DNA is a sequence of two or more contiguous, approximate copies of a pattern of nucleotides. Tandem repeats occur in the genomes of both eukaryotic and prokaryotic organisms. They are important in numerous fields including disease diagnosis, mapping studies, human identity testing (DNA fingerprinting), sequence homology and population studies. Although tandem repeats have been used by biologists for many years, there are few tools available for performing an exhaustive search for all tandem repeats in a given sequence. RESULTS: In this paper we describe an efficient algorithm for finding all tandem repeats within a sequence, under the edit distance measure. The contributions of this paper are two-fold: theoretical and practical. We present a precise definition for tandem repeats over the edit distance and an efficient, deterministic algorithm for finding these repeats. AVAILABILITY: The algorithm has been implemented in C++, and the software is available upon request and can be used at http://www.sci.brooklyn.cuny.edu/~sokol/trepeats. The use of this tool will assist biologists in discovering new ways that tandem repeats affect both the structure and function of DNA and protein molecules.

7.
The final step in the conversion of vitamin B12 into coenzyme B12 (adenosylcobalamin, AdoCbl) is catalyzed by ATP:cob(I)alamin adenosyltransferase (ATR). Prior studies identified the human ATR and showed that defects in its encoding gene underlie cblB methylmalonic aciduria. Here two common polymorphic variants of the ATR that are found in normal individuals are expressed in Escherichia coli, purified, and partially characterized. The specific activities of ATR variants 239K and 239M were 220 and 190 nmol min⁻¹ mg⁻¹, and their Km values were 6.3 and 6.9 μM for ATP and 1.2 and 1.6 μM for cob(I)alamin, respectively. These values are similar to those obtained for previously studied bacterial ATRs, indicating that both human variants have sufficient activity to mediate AdoCbl synthesis in vivo. Investigations also showed that purified recombinant human methionine synthase reductase (MSR) in combination with purified ATR can convert cob(II)alamin to AdoCbl in vitro. In this system, MSR reduced cob(II)alamin to cob(I)alamin that was adenosylated to AdoCbl by ATR. The optimal stoichiometry for this reaction was approximately 4 MSR/ATR and results indicated that MSR and ATR physically interacted in such a way that the highly reactive reaction intermediate [cob(I)alamin] was sequestered. The finding that MSR reduced cob(II)alamin to cob(I)alamin for AdoCbl synthesis (in conjunction with the prior finding that MSR reduced cob(II)alamin for the activation of methionine synthase) indicates a dual physiological role for MSR.

8.
An efficient algorithm for detecting approximate tandem repeats in genomic sequences is presented. The algorithm is based on innovative statistical criteria to detect candidate regions which may include tandem repeats; these regions are subsequently verified by alignments based on dynamic programming. No prior information about the period size or pattern is needed. Also, the algorithm is virtually capable of detecting repeats with any period. An implementation of the algorithm is compared with the two state-of-the-art tandem repeats detection tools to demonstrate its effectiveness both on natural and synthetic data. The algorithm is available at www.cs.brown.edu/people/domanic/tandem/.

9.
DNA replication intermediates of three plasmids containing all or part of a modified Epstein-Barr virus cis-acting plasmid maintenance region (oriP) were examined to further investigate oriP function. Replication intermediates were analyzed in vivo and in vitro by neutral-neutral two-dimensional gel electrophoresis. The major functional components of the wild-type oriP are a 140-bp dyad symmetry region (single dyad) and 20 tandem copies of a repeat with a 30-bp consensus sequence (family of repeats). A modified oriP was constructed by replacing the family of repeats with three tandem copies of the single dyad (D. A. Wysokenski and J. L. Yates, J. Virol. 63:2657-2666, 1989). Initiation was observed in vivo near the single dyad in the modified oriP, as seen in the wild-type oriP (T. A. Gahn and C. L. Schildkraut, Cell 58:527-535, 1989), but was not observed near the tandem dyads. A replication barrier and termination were observed near the tandem dyads and were similar to those observed at the family of repeats of the wild-type oriP (Gahn and Schildkraut, Cell 58:527-535, 1989). In vitro experiments indicate that the viral trans-acting factor EBNA-1 contributes to efficient barrier formation at the tandem dyads as observed in the family of repeats of the wild-type oriP (V. Dhar and C. L. Schildkraut, Mol. Cell. Biol. 11:6268-6278, 1991). The tandem dyads thus appear to function in a manner similar to the family of repeats. There are significant structural differences between the family of repeats and tandem dyads. The relationship between the number and relative positions of EBNA-1 binding sites in relation to the functions of the family of repeats and the dyad symmetry element is discussed.

10.
Finding approximate tandem repeats in genomic sequences.   Cited by: 1 (self-citations: 0, citations by others: 1)
An efficient algorithm is presented for detecting approximate tandem repeats in genomic sequences. The algorithm is based on a flexible statistical model which allows a wide range of definitions of approximate tandem repeats. The ideas and methods underlying the algorithm are described and its effectiveness on genomic data is demonstrated.

11.
We show the presence of numerous short tandem repeats in the human cytomegalovirus (HCMV) genome and assess their usefulness as molecular markers. The genome is shown to contain at least 24 microsatellite regions that exhibit length polymorphisms. Insertion-deletion polymorphisms at these short tandem repeats are common (80% of repeats examined are polymorphic among two laboratory strains and 10 clinical isolates). This is the first report of widespread microsatellite length polymorphism in a viral genome. Some regions are highly polymorphic: one was revealed by DNA sequencing to contain length variants at five closely linked sites, which combined resulted in 10 variants for this region among the 12 strains and isolates examined. This study not only provides a new molecular marker system for this virus but also extends our understanding of microsatellite polymorphism in two important ways. First, variable-length repeats in HCMV can be considerably shorter than polymorphic repeats previously found in other organisms. Second, highly variable microsatellite repeats are not confined to prokaryotes and eukaryotes, as previously assumed. This variation provides a useful marker system for distinguishing viral isolates, and similar markers are also likely to be found in other large-genome DNA viruses.

12.
Tandem repeats play many important roles in biological research. However, accurate characterization of their properties is limited by the inability to easily detect them. For this reason, much work has been devoted to developing detection algorithms. A widely used algorithm for detecting tandem repeats is the "tandem repeats finder" (Benson, G., Nucleic Acids Res. 27, 573-580, 1999). In that algorithm, tandem repeats are modeled by percent matches and frequency of indels between adjacent pattern copies, and statistical criteria are used to recognize them. We give a method for computing the exact joint distribution of a pair of statistics that are used in the testing procedures of the "tandem repeats finder": the total number of matches in matching tuples of length k or longer, and the total number of observations from the beginning of the first such matching tuple to the end of the last one. This allows the computation of the conditional distribution of the latter statistic given the former, a conditional distribution that is used to test for tandem repeats as opposed to non-tandem direct repeats. The setting is a Markovian sequence of a general order. Current approaches to this distributional problem deal only with independent trials and are based on approximations via simulation.
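The pair of statistics named in this abstract can be made concrete with a small sketch. It assumes one reading of the definition: the input is a sequence of match/mismatch indicators, and a "matching tuple of length k or longer" is a maximal run of at least k consecutive matches. The function name `tuple_match_statistics` is hypothetical, and the sketch only evaluates the two statistics on a single observed sequence; it does not compute their joint distribution.

```python
from itertools import groupby
from typing import Iterable, Tuple


def tuple_match_statistics(matches: Iterable[int], k: int) -> Tuple[int, int]:
    """Return (total, span) for a sequence of 0/1 match indicators.

    total: number of matches lying in maximal runs of >= k consecutive matches.
    span:  number of observations from the start of the first such run to the
           end of the last one (0 if no run qualifies).
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    pos = 0                 # index of the first element of the current run
    total = 0
    first = last_end = None
    for is_match, group in groupby(matches):
        length = sum(1 for _ in group)
        if is_match and length >= k:
            total += length
            if first is None:
                first = pos
            last_end = pos + length   # exclusive end of the qualifying run
        pos += length
    span = 0 if first is None else last_end - first
    return total, span


indicators = [1, 1, 1, 0, 1, 0, 1, 1, 1, 1]
print(tuple_match_statistics(indicators, 3))  # (7, 10)
print(tuple_match_statistics(indicators, 4))  # (4, 4)
```

With k = 3 both the run of three matches and the run of four qualify, so the span covers all ten observations; with k = 4 only the final run counts.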

13.
To exert its activity, anthrax toxin must be endocytosed and its enzymatic toxic subunits delivered to the cytoplasm. It has been proposed that, in addition to the anthrax toxin receptors (ATRs), lipoprotein-receptor-related protein 6 (LRP6), known for its role in Wnt signalling, is also required for toxin endocytosis. These findings have however been challenged. We show that LRP6 can indeed form a complex with ATRs, and that this interaction plays a role both in Wnt signalling and in anthrax toxin endocytosis. We found that ATRs control the levels of LRP6 in cells, and thus the Wnt signalling capacity. RNAi against ATRs indeed led to a drastic decrease in LRP6 levels and a subsequent drop in Wnt signalling. Conversely, LRP6 plays a role in anthrax toxin endocytosis, but is not essential. We indeed found that toxin binding triggered tyrosine phosphorylation of LRP6, induced its redistribution into detergent-resistant domains, and its subsequent endocytosis. RNAis against LRP6 strongly delayed toxin endocytosis. As the physiological role of ATRs is probably to interact with the extracellular matrix, our findings raise the interesting possibility that, through the ATR-LRP6 interaction, adhesion to the extracellular matrix could locally control Wnt signalling.

14.
15.
Ames D, Murphy N, Helentjaris T, Sun N, Chandler V. Genetics, 2008, 179(3):1693-1704
Using the compiled human genome sequence, we systematically cataloged all tandem repeats with periods between 20 and 2000 bp and defined two subsets whose consensus sequences were found at either single-locus tandem repeats (slTRs) or multilocus tandem repeats (mlTRs). Parameters compiled for these subsets provide insights into mechanisms underlying the creation and evolution of tandem repeats. Both subsets of tandem repeats are nonrandomly distributed in the genome, being found at higher frequency at many but not all chromosome ends; internal clusters of mlTRs were also observed. Despite the integral role of recombination in the biology of tandem repeats, recombination hotspots colocalized only with shorter microsatellites and not the longer repeats examined here. An increased frequency of slTRs was observed near imprinted genes, consistent with a functional role, while both slTRs and mlTRs were found more frequently near genes implicated in triplet expansion diseases, suggesting a general instability of these regions. Using our collated parameters, we identified 2230 slTRs as candidates for highly informative molecular markers.

16.
Sohn KH, Lei R, Nemri A, Jones JD. The Plant Cell, 2007, 19(12):4077-4090
The downy mildew (Hyaloperonospora parasitica) effector proteins ATR1 and ATR13 trigger RPP1-Nd/WsB- and RPP13-Nd-dependent resistance, respectively, in Arabidopsis thaliana. To better understand the functions of these effectors during compatible and incompatible interactions of H. parasitica isolates on Arabidopsis accessions, we developed a novel delivery system using Pseudomonas syringae type III secretion via fusions of ATRs to the N terminus of the P. syringae effector protein, AvrRPS4. ATR1 and ATR13 both triggered the hypersensitive response (HR) and resistance to bacterial pathogens in Arabidopsis carrying RPP1-Nd/WsB or RPP13-Nd, respectively, when delivered from P. syringae pv tomato (Pst) DC3000. In addition, multiple alleles of ATR1 and ATR13 confer enhanced virulence to Pst DC3000 on susceptible Arabidopsis accessions. We conclude that ATR1 and ATR13 positively contribute to pathogen virulence inside host cells. Two ATR13 alleles suppressed bacterial PAMP (for Pathogen-Associated Molecular Patterns)-triggered callose deposition in susceptible Arabidopsis when delivered by DC3000 ΔCEL mutants. Furthermore, expression of another allele of ATR13 in plant cells suppressed PAMP-triggered reactive oxygen species production in addition to callose deposition. Intriguingly, although Wassilewskija (Ws-0) is highly susceptible to H. parasitica isolate Emco5, ATR13Emco5 when delivered by Pst DC3000 triggered localized immunity, including HR, on Ws-0. We suggest that an additional H. parasitica Emco5 effector might suppress ATR13-triggered immunity.

17.
Tandem repeats finder: a program to analyze DNA sequences.   Cited by: 66 (self-citations: 3, citations by others: 63)
A tandem repeat in DNA is two or more contiguous, approximate copies of a pattern of nucleotides. Tandem repeats have been shown to cause human disease, may play a variety of regulatory and evolutionary roles and are important laboratory and analytic tools. Extensive knowledge about pattern size, copy number, mutational history, etc. for tandem repeats has been limited by the inability to easily detect them in genomic sequence data. In this paper, we present a new algorithm for finding tandem repeats which works without the need to specify either the pattern or pattern size. We model tandem repeats by percent identity and frequency of indels between adjacent pattern copies and use statistically based recognition criteria. We demonstrate the algorithm's speed and its ability to detect tandem repeats that have undergone extensive mutational change by analyzing four sequences: the human frataxin gene, the human beta T cell receptor locus sequence and two yeast chromosomes. These sequences range in size from 3 kb up to 700 kb. A World Wide Web server interface at c3.biomath.mssm.edu/trf.html has been established for automated use of the program.
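The notion of percent identity between adjacent pattern copies can be sketched in a few lines. The following is a simplified illustration only: it ignores indels, compares each base with the base one period downstream, and scans candidate periods exhaustively. It is not the program described in the abstract, and the names `adjacent_copy_identity` and `best_period` are hypothetical.

```python
from typing import Tuple


def adjacent_copy_identity(seq: str, period: int) -> float:
    """Fraction of positions i with seq[i] == seq[i + period].

    A value of 1.0 means every copy is identical to the next one
    (no substitutions); indels are not modeled in this sketch."""
    if not 0 < period < len(seq):
        raise ValueError("period must satisfy 0 < period < len(seq)")
    compared = len(seq) - period
    matches = sum(seq[i] == seq[i + period] for i in range(compared))
    return matches / compared


def best_period(seq: str, max_period: int) -> Tuple[int, float]:
    """Return (period, identity) with the highest identity among
    periods 1..max_period; ties go to the smallest period."""
    candidates = range(1, min(max_period, len(seq) - 1) + 1)
    return max(((p, adjacent_copy_identity(seq, p)) for p in candidates),
               key=lambda pair: pair[1])


print(adjacent_copy_identity("ACGACGACG", 3))  # 1.0
print(best_period("ACGACGACG", 4))             # (3, 1.0)
```

A single substitution lowers the score without hiding the period: for ACGACTACG at period 3, four of the six compared positions agree.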

18.
To date, various G-quadruplex structures have been reported in the human genome. There are numerous studies focusing on quadruplex-forming sequences in general, but few studies have focused on two or more quadruplexes in the same molecule, which are most commonly found in telomeric DNA and other tandem repeats, e.g., insulin-linked polymorphic region (ILPR). Although the human telomere consists of a number of repeats, higher-order G-quadruplex structures are discussed less often because of the complexity of the structures. In this study, sequences consisting of 4-12 repeats of d(G(4)TGT), d(G(3)T(2)A), and/or d(G(4)T(2)A) have been studied by circular dichroism, ultraviolet spectroscopy, and temperature-gradient gel electrophoresis. These sequences serve as a model for the arrangement of quadruplexes in the telomere and ILPR in solution. Our major findings are as follows. (i) The number of G-rich repeats has a great influence on G-quadruplex stability. (ii) The evidence of quadruplex-quadruplex interaction is confirmed. (iii) For the first time, we directly observed the melting behavior of different conformers in a single experiment. Our results agree with other calorimetric and spectroscopic data and data obtained by single-molecule studies, atomic force microscopy, and mechanical unfolding by optical tweezers. We propose that the end of telomeres can be formed by only a few tandem quadruplexes (fewer than three). Our findings improve our understanding of the mechanism of G-quadruplex formation in long repeats in G-rich-regulating parts of genes and telomere ends.

19.
Some genetic diseases in human beings are dominated by short sequences repeated consecutively called tandem repeats. Once a region containing tandem repeats is found, it is of great interest to study the history of creating the repeats. The computational problem of reconstructing the duplication history of tandem repeats has been studied extensively in the literature. Almost all previous studies focused on the simplest case where the size of each duplication block is 1. Only recently we succeeded in giving the first polynomial-time approximation algorithm with a guaranteed ratio for a more general case where the size of each duplication block is at most 2; the algorithm achieves a ratio of 6 and runs in O(n^{11}) time. In this paper, we present two new polynomial-time approximation algorithms for this more general case. One of them achieves a ratio of 5 and runs in O(n^9) time, while the other achieves a ratio of 2.5+epsilon for any constant epsilon > 0 but runs slower.

20.
Kiani C, Chen L, Lee V, Zheng PS, Wu Y, Wen J, Cao L, Adams ME, Sheng W, Yang BB. Biochemistry, 2003, 42(23):7226-7237
Members of the large aggregating chondroitin sulfate proteoglycans are characterized by an N-terminal fragment known as G1 domain, which is composed of an immunoglobulin (IgG)-like motif and two tandem repeats (TR). Previous studies have indicated that the expressed product of aggrecan G1 domain was not secreted. Here we demonstrated that the inability of G1 secretion was associated with the tandem repeats but not the IgG-like motif, and specifically with TR1 of aggrecan. We also demonstrated that the G2 domain, a domain unique to aggrecan, had a similar effect on product secretion. The sequence of TR1 of G1 is highly conserved across species, which suggested similar functions played by these motifs. In a yeast two-hybrid assay, TR1 interacted with the calcium homeostasis endoplasmic reticulum protein. Deletion/mutation experiments indicated that the N-terminal fragment of TR1, in particular, the amino acids H(2)R(4) of this motif were key to its effect on product secretion. However, the N-terminal 55 amino acids were required to exert this function. Taken together, our study suggests a possible molecular mechanism for the function of the tandem repeats in product processing.
