Similar literature
 20 similar articles found (search time: 15 ms)
1.
MOTIVATION: Recombination plays an important role in the evolution of many pathogens, such as HIV or malaria. Despite substantial prior work, there is still a pressing need for efficient and effective methods of detecting recombination and analyzing recombinant sequences. RESULTS: We introduce Recco, a novel fast method that, given a multiple sequence alignment, scores the cost of obtaining one of the sequences from the others by mutation and recombination. The algorithm comes with an illustrative visualization tool for locating recombination breakpoints. We analyze the sequence alignment with respect to all choices of the parameter alpha weighting recombination cost against mutation cost. The analysis of the resulting cost curve yields additional information as to which sequence might be recombinant. On random genealogies Recco is comparable in its power of detecting recombination with the algorithm Geneconv (Sawyer, 1989). For specific relevant recombination scenarios Recco significantly outperforms Geneconv.
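
The idea of scoring one sequence against the others under a trade-off parameter alpha can be illustrated with a small dynamic program. This is only a sketch under stated assumptions, not Recco's actual cost model, which the abstract does not specify: here each mismatch is assumed to cost `1 - alpha`, each switch of the copied parent is assumed to cost `alpha`, and the names `recombination_cost` and `cost_curve` are hypothetical.

```python
def recombination_cost(query, parents, alpha):
    """Cheapest way to copy `query` from aligned `parents`.

    Assumed cost model (not taken from the Recco paper):
    each mismatch costs (1 - alpha), each parent switch costs alpha.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if not query or not parents or any(len(p) != len(query) for p in parents):
        raise ValueError("query and parents must be non-empty and equally long")
    mutation = 1.0 - alpha
    # cost[j]: cheapest explanation of the prefix seen so far that ends
    # by copying from parent j.
    cost = [mutation * (p[0] != query[0]) for p in parents]
    for i in range(1, len(query)):
        cheapest = min(cost)
        cost = [
            # Either stay on parent j, or switch to j from the cheapest parent.
            min(cost[j], cheapest + alpha) + mutation * (parents[j][i] != query[i])
            for j in range(len(parents))
        ]
    return min(cost)


def cost_curve(query, parents, alphas):
    """Evaluate the cost for several alpha values (the 'cost curve')."""
    return [(alpha, recombination_cost(query, parents, alpha)) for alpha in alphas]


if __name__ == "__main__":
    parents = ["AAAAAAAA", "TTTTTTTT"]
    for alpha, cost in cost_curve("AAAATTTT", parents, [0.1, 0.5, 0.9]):
        print(f"alpha={alpha:.1f} cost={cost:.2f}")
```

When recombination is cheap (small alpha), the mosaic query is explained by a single switch between parents; when it is expensive, the same query is explained by mutations alone, so the shape of the curve hints at whether a sequence is a plausible recombinant.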

2.
3.
Hao W. Gene, 2011, 481(2): 57-64
The evolution of influenza viruses is remarkably dynamic. Influenza viruses evolve rapidly in sequence and undergo frequent reassortment of different gene segments. Homologous recombination, although commonly seen as an important component of dynamic genome evolution in many other organisms, is believed to be rare in influenza. In this study, 256 gene segments from 32 influenza A genomes were examined for homologous recombination. Three recombinant H1N1 strains were detected, and they most likely resulted from one recombination event between two closely related parental sequences. These findings suggest that homologous recombination in influenza viruses tends to take place between strains sharing high sequence similarity. The three recombinant strains were isolated at different time periods and they form a clade, indicating that recombinant strains could circulate. In addition, the simulation results showed that many recombinant sequences might not be detectable by existing recombination detection programs when the parental sequences are highly similar. Finally, possible ways to improve the accuracy of detecting recombinant sequences in influenza are discussed.

4.
GARD: a genetic algorithm for recombination detection   Total citations: 6, self-citations: 0, citations by others: 6
MOTIVATION: Phylogenetic and evolutionary inference can be severely misled if recombination is not accounted for, hence screening for it should be an essential component of nearly every comparative study. The evolution of recombinant sequences cannot be properly explained by a single phylogenetic tree, but several phylogenies may be used to correctly model the evolution of non-recombinant fragments. RESULTS: We developed a likelihood-based model selection procedure that uses a genetic algorithm to search multiple sequence alignments for evidence of recombination breakpoints and identify putative recombinant sequences. GARD is an extensible and intuitive method that can be run efficiently in parallel. Extensive simulation studies show that the method nearly always outperforms other available tools, both in terms of power and accuracy, and that the use of GARD to screen sequences for recombination ensures good statistical properties for methods aimed at detecting positive selection. AVAILABILITY: Freely available at http://www.datamonkey.org/GARD/

5.
M Ponzi, T Pace, E Dore, L Picci, E Pizzi, C Frontali. Nucleic Acids Research, 1992, 20(17): 4491-4497
The dynamics of telomere turnover were studied in Plasmodium, whose telomeric structures consist of linear, recognisable sequences of two distinct repeats (TTTAGGG and TTCAGGG). Independent recombinant clones containing a well-defined chromosomal extremity of Plasmodium berghei, both before and after a rare insertion event took place, were obtained from clonal parasite populations and analysed. The insertion, which splits the original telomere and causes a significant reduction in the size of the telomeric structure, is shown to consist of an integer number of subtelomeric repeats typical of P. berghei, flanked on both sides by telomere-derived motifs. Analysis of the telomeric repeat sequence heterogeneity in the otherwise homogeneous populations examined is compatible with a model in which diversification of a given telomere is driven by the occurrence of breakpoints whose frequency rapidly increases along the telomeric tract when moving in the outward direction. The breakpoints might be due either to terminal deletions followed by random serial addition of the two repeat versions, or to recombination events. The shortening/elongation mechanism is favoured over the recombination hypothesis because of the absence of higher-order patterns in the sequence of telomeric repeats.

6.
Phylogenetic evidence for recombination in dengue virus.   Total citations: 15, self-citations: 0, citations by others: 15
A split decomposition analysis of dengue (DEN) virus gene sequences revealed extensive networked evolution, indicative of recombination, among DEN-1 strains but not within serotypes DEN-2, DEN-3, or DEN-4. Within DEN-1, two viruses sampled from South America in the last 10 years were identified as recombinants. To map the breakpoints and test their statistical support, we developed a novel maximum likelihood method. In both recombinants, the breakpoints were found to be in similar positions, within the fusion peptide of the envelope protein, demonstrating that a single recombination event occurred prior to the divergence of these two strains. This is the first report of recombination in natural populations of dengue virus.

7.
Human enteroviruses consist of more than 60 serotypes, reflecting a wide range of evolutionary divergence. They have been genetically classified into four clusters on the basis of sequence homology in the coding region of the single-stranded RNA genome. To explore further the genetic relationships between human enteroviruses and to characterize the evolutionary mechanisms responsible for variation, previously sequenced genomes were subjected to detailed comparison. Bootstrap and genetic similarity analyses were used to systematically scan the alignments of complete genomic sequences. Bootstrap analysis provided evidence of an early recombination event at the junction of the 5' noncoding and coding regions of the progenitors of the current clusters. Analysis within the genetic clusters indicated that enterovirus prototype strains include intraspecies recombinants. Recombination breakpoints were detected in all genomic regions except the capsid protein coding region. Our results suggest that recombination is a significant and relatively frequent mechanism in the evolution of enterovirus genomes.

8.
Deletion of chromosome 9p21 is a crucial event for the development of several cancers including acute lymphoblastic leukemia (ALL). Double strand breaks (DSBs) triggering 9p21 deletions in ALL have been reported to occur at a few defined sites by illegitimate action of the V(D)J recombination activating protein complex. We have cloned 23 breakpoint junctions for a total of 46 breakpoints in 17 childhood ALL (9 B- and 8 T-lineages) showing different size deletions at one or both homologous chromosomes 9 to investigate which particular sequences make the region susceptible to interstitial deletion. We found that half of 9p21 deletion breakpoints were mediated by ectopic V(D)J recombination mechanisms whereas the remaining half were associated with repeated sequences, including some with potential for non-B DNA structure formation. Other mechanisms, such as microhomology-mediated repair, that are common in other cancers, play only a very minor role in ALL. Nucleotide insertions at breakpoint junctions and microinversions flanking the breakpoints have been detected at 20/23 and 2/23 breakpoint junctions, respectively, both in the presence of recombination signal sequence (RSS)-like sequences and of other unspecific sequences. The majority of breakpoints were unique except for two cases, both T-ALL, showing identical deletions. Four of the 46 breakpoints coincide with those reported in other cases, thus confirming the presence of recurrent deletion hotspots. Among the six cases with heterozygous 9p deletions, we found that the remaining CDKN2A and CDKN2B alleles were hypermethylated at CpG islands.

9.
Advances in high-throughput DNA sequencing technologies have led to an explosion in the number of sequenced bacterial genomes. Comparative sequence analysis frequently reveals evidence of homologous recombination occurring with different mechanisms and rates in different species, but the large-scale use of computational methods to identify recombination events is hampered by their high computational costs. Here, we propose a new method to identify recombination events in large datasets of whole genome sequences. Using a filtering procedure of the gene conservation profiles of a test genome against a panel of strains, this algorithm identifies sets of contiguous genes acquired by homologous recombination. The locations of the recombination breakpoints are determined using a statistical test that is able to account for the differences in the natural rate of evolution between different genes. The algorithm was tested on a dataset of 75 genomes of Staphylococcus aureus and 50 genomes comprising different streptococcal species, and was able to detect intra-species recombination events in S. aureus and in Streptococcus pneumoniae. Furthermore, we found evidence of an inter-species exchange of genetic material between S. pneumoniae and Streptococcus mitis, a closely related commensal species that colonizes the same ecological niche. The method has been implemented in an R package, Reco, which is freely available from supplementary material, and provides a rapid screening tool to investigate recombination on a genome-wide scale from sequence data.

10.
Noroviruses are single-stranded RNA viruses with high genomic variability. They have emerged in the last decade as a major cause of acute gastroenteritis. It remains unclear whether norovirus evolution is driven by sequence mutation and/or recombination. In this study, we have assessed the occurrence of recombination in the norovirus capsid gene. For this purpose, 69 complete capsid sequences of norovirus strains accessible in GenBank as well as 25 complete capsid sequences generated from norovirus-positive clinical samples were examined. Unreported recombination was detected in about 8% of norovirus strains belonging to genetic clusters I/1 (n = 1), II/1 (n = 1), II/3 (n = 1), II/4 (n = 3), and II/5 (n = 1). Recombination breakpoints were mainly located at the interface of the putative P1-1 and P2 domains of the capsid protein and/or within the P2 domain. The recombination region displayed features such as length, sequence composition (upstream and downstream GC- and AU-rich sequences, respectively), and predicted RNA secondary structure that are characteristic of homologous recombination activators. Our results suggest that recombination in the norovirus capsid gene may naturally occur, involving capsid domains presumably exposed to immunological pressure.

11.
A statistical test for detecting geographic subdivision.   Total citations: 24, self-citations: 0, citations by others: 24
A statistical test for detecting genetic differentiation of subpopulations is described that uses molecular variation in samples of DNA sequences from two or more localities. The statistical significance of the test is determined with Monte Carlo simulations. The power of the test to detect genetic differentiation in a selectively neutral Wright-Fisher island model depends on both sample size and the rates of migration, mutation, and recombination. It is found that the power of the test is substantial with samples of size 50 when 4Nm < 10, where N is the subpopulation size and m is the fraction of migrants in each subpopulation each generation. More powerful tests are obtained with genes with recombination than with genes without recombination.
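
A Monte Carlo significance test of this kind can be sketched as a permutation test on locality labels. The abstract does not name the test statistic, so the one used here (mean between-locality minus mean within-locality pairwise differences) is an assumption, as are the function names `differentiation_stat` and `permutation_p_value`.

```python
import random
from itertools import combinations


def hamming(a, b):
    """Number of differing sites between two aligned sequences."""
    return sum(x != y for x, y in zip(a, b))


def differentiation_stat(seqs, labels):
    """Assumed statistic: mean between-locality minus mean within-locality distance."""
    within, between = [], []
    for i, j in combinations(range(len(seqs)), 2):
        bucket = within if labels[i] == labels[j] else between
        bucket.append(hamming(seqs[i], seqs[j]))
    if not within or not between:
        raise ValueError("need at least one within- and one between-locality pair")
    return sum(between) / len(between) - sum(within) / len(within)


def permutation_p_value(seqs, labels, n_perm=999, seed=0):
    """Estimate the p-value by shuffling locality labels among the sequences."""
    rng = random.Random(seed)
    observed = differentiation_stat(seqs, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if differentiation_stat(seqs, shuffled) >= observed:
            hits += 1
    # Count the observed labelling itself so the estimate is never zero.
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    seqs = ["AAAA"] * 5 + ["TTTT"] * 5
    labels = [0] * 5 + [1] * 5
    print(permutation_p_value(seqs, labels))
```

Shuffling labels keeps the sample size of each locality fixed, so the null distribution reflects only the hypothesis that locality carries no information about sequence similarity.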

12.
We describe genomic structures of 59 X-chromosome segmental duplications that include the proteolipid protein 1 gene (PLP1) in patients with Pelizaeus-Merzbacher disease. We provide the first report of 13 junction sequences, which gives insight into underlying mechanisms. Although proximal breakpoints were highly variable, distal breakpoints tended to cluster around low-copy repeats (LCRs) (50% of distal breakpoints), and each duplication event appeared to be unique (100 kb to 4.6 Mb in size). Sequence analysis of the junctions revealed no large homologous regions between proximal and distal breakpoints. Most junctions had microhomology of 1-6 bases, and one had a 2-base insertion. Boundaries between single-copy and duplicated DNA were identical to the reference genomic sequence in all patients investigated. Taken together, these data suggest that the tandem duplications are formed by a coupled homologous and nonhomologous recombination mechanism. We suggest repair of a double-stranded break (DSB) by one-sided homologous strand invasion of a sister chromatid, followed by DNA synthesis and nonhomologous end joining with the other end of the break. This is in contrast to other genomic disorders that have recurrent rearrangements formed by nonallelic homologous recombination between LCRs. Interspersed repetitive elements (Alu elements, long interspersed nuclear elements, and long terminal repeats) were found at 18 of the 26 breakpoint sequences studied. No specific motif that may predispose to DSBs was revealed, but single or alternating tracts of purines and pyrimidines that may cause secondary structures were common. Analysis of the 2-Mb region susceptible to duplications identified proximal-specific repeats and distal LCRs in addition to the previously reported ones, suggesting that the unique genomic architecture may have a role in nonrecurrent rearrangements by promoting instability.

13.
14.
To understand molecular pathways underlying 9p21 deletions, which lead to inactivation of the p16/CDKN2A, p14/ARF, and/or p15/CDKN2B genes, in lymphoid leukemia, 30 breakpoints were cloned from 15 lymphoid leukemia cell lines. Seventeen (57%) breakpoints were mapped at five breakpoint cluster sites, BCS-LL1 to LL5, each of <15 bp. Two breakpoint cluster sites were located within the ARF and CDKN2B loci, respectively, whereas the remaining three were located >100 kb distal to the CDKN2A, ARF, and CDKN2B loci. The sequences of breakpoint junctions indicated that deletions in 11 (73%) cell lines were mediated by illegitimate V(D)J recombination targeted at the five BCS-LL and six other sites, which contain sequences similar to recombination signal sequences for V(D)J recombination. An extrachromosomal V(D)J recombination assay indicated that BCS-LL3, at which the largest number of breakpoints (i.e. five breakpoints) was clustered, has a V(D)J recombination potential 150-fold less than the consensus recombination signal sequence. Three other BCS-LLs tested also showed V(D)J recombination potential, although it was lower than that of BCS-LL3. These results indicated that illegitimate V(D)J recombination, which was targeted at several ectopic recombination signal sequences widely distributed in 9p21, caused a large fraction of 9p21 deletions in lymphoid leukemia.

15.
MOTIVATION: We present a statistical method for detecting recombination, whose objective is to accurately locate the recombinant breakpoints in DNA sequence alignments of small numbers of taxa (4 or 5). Our approach explicitly models the sequence of phylogenetic tree topologies along a multiple sequence alignment. Inference under this model is done in a Bayesian way, using Markov chain Monte Carlo (MCMC). The algorithm returns the site-dependent posterior probability of each tree topology, which is used for detecting recombinant regions and locating their breakpoints. RESULTS: The method was tested on one synthetic and three real DNA sequence alignments, where it was found to outperform the established detection methods PLATO, RECPARS, and TOPAL.

16.
The breakpoint regions of both translocation products of the (9;22) Philadelphia translocation of CML patient 83-H84 and their normal chromosome 9 and 22 counterparts have been cloned and analysed. Southern blotting with bcr probes and DNA sequencing revealed that the breaks on chromosome 22 occurred 3' of bcr exon b3 and that the 88 nucleotides between the breakpoints in the chromosome 22 bcr region were deleted. Besides this small deletion of chromosome 22 sequences a large deletion of chromosome 9 sequences (greater than 70 kb) was observed. The chromosome 9 sequences remaining on the 9q+ chromosome (9q+ breakpoint) are located at least 100 kb upstream of the v-abl homologous c-abl exons whereas the translocated chromosome 9 sequences (22q-breakpoint) could be mapped 30 kb upstream of these c-abl sequences. The breakpoints were situated in Alu-repetitive sequences either on chromosome 22 or on chromosome 9, strengthening the hypothesis that Alu-repetitive sequences can be hot spots for recombination.

17.
We propose a novel method for detecting sites of molecular recombination in multiple alignments. Our approach is a compromise between previous extremes of computationally prohibitive but mathematically rigorous methods and imprecise heuristic methods. Using a combined algorithm for estimating tree structure and hidden Markov model parameters, our program detects changes in phylogenetic tree topology over a multiple sequence alignment. We evaluate our method on benchmark datasets from previous studies on two recombinant pathogens, Neisseria and HIV-1, as well as simulated data. We show that we are able to detect not only recombinant regions of vastly different sizes but also the locations of breakpoints with great accuracy. We show that our method does well at inferring recombination breakpoints while at the same time remaining practical for larger datasets. In all cases, we confirm the breakpoint predictions of previous studies, and in many cases we offer novel predictions.

18.

Background

In prokaryotes and some eukaryotes, genetic material can be transferred laterally among unrelated lineages and recombined into new host genomes, providing metabolic and physiological novelty. Although the process is usually framed in terms of gene sharing (e.g. lateral gene transfer, LGT), there is little reason to imagine that the units of transfer and recombination correspond to entire, intact genes. Proteins often consist of one or more spatially compact structural regions (domains) which may fold autonomously and which, singly or in combination, confer the protein's specific functions. As LGT is frequent in strongly selective environments and natural selection is based on function, we hypothesized that domains might also serve as modules of genetic transfer, i.e. that regions of DNA that are transferred and recombined between lineages might encode intact structural domains of proteins.

Methodology/Principal Findings

We selected 1,462 orthologous gene sets representing 144 prokaryotic genomes, and applied a rigorous two-stage approach to identify recombination breakpoints within these sequences. Recombination breakpoints are very significantly over-represented in gene sets within which protein domain-encoding regions have been annotated. Within these gene sets, breakpoints significantly avoid the domain-encoding regions (domons), except where these regions constitute most of the sequence length. Recombination breakpoints that fall within longer domons are distributed uniformly at random, but those that fall within shorter domons may show a slight tendency to avoid the domon midpoint. As we find no evidence for differential selection against nucleotide substitutions following the recombination event, any bias against disruption of domains must be a consequence of the recombination event per se.

Conclusions/Significance

This is the first systematic study relating the units of LGT to structural features at the protein level. Many genes have been interrupted by recombination following inter-lineage genetic transfer, during which the regions within these genes that encode protein domains have not been preferentially preserved intact. Protein domains are units of function, but domons are not modules of transfer and recombination. Our results demonstrate that LGT can remodel even the most functionally conservative modules within genomes.

19.
MOTIVATION: Sequence alignments obtained using affine gap penalties are not always biologically correct, because the insertion of long gaps is over-penalised. There is a need for an efficient algorithm which can find local alignments using non-linear gap penalties. RESULTS: A dynamic programming algorithm is described which computes optimal local sequence alignments for arbitrary, monotonically increasing gap penalties, i.e. where the cost g(k) of inserting a gap of k symbols is such that g(k) ≥ g(k-1). The running time of the algorithm is dependent on the scoring scheme; if the expected score of an alignment between random, unrelated sequences of lengths m, n is proportional to log mn, then with one exception, the algorithm has expected running time O(mn). Elsewhere, the running time is no greater than O(mn(m+n)). Optimisations are described which appear to reduce the worst-case run-time to O(mn) in many cases. We show how using a non-affine gap penalty can dramatically increase the probability of detecting a similarity containing a long gap. AVAILABILITY: The source code is available to academic collaborators under licence.
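
For orientation, the O(mn(m+n)) upper bound corresponds to the textbook local-alignment recurrence with a general gap function, sketched below. This is not the optimised algorithm the abstract describes, and the scoring values, the logarithmic penalty `log_gap`, and the function name `local_align_score` are illustrative assumptions.

```python
import math


def local_align_score(a, b, gap, match=2.0, mismatch=-1.0):
    """Best local alignment score with an arbitrary gap penalty gap(k).

    Plain cubic-time recurrence: every cell tries every gap length.
    """
    m, n = len(a), len(b)
    H = [[0.0] * (n + 1) for _ in range(m + 1)]
    best = 0.0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score = max(0.0, H[i - 1][j - 1] + sub)
            for k in range(1, i + 1):  # gap of length k in b
                score = max(score, H[i - k][j] - gap(k))
            for k in range(1, j + 1):  # gap of length k in a
                score = max(score, H[i][j - k] - gap(k))
            H[i][j] = score
            best = max(best, score)
    return best


def log_gap(k):
    """Hypothetical monotonically increasing, non-affine penalty."""
    return 2.0 + math.log(k)


if __name__ == "__main__":
    a = "AAAACCCC"
    b = "AAAA" + "G" * 6 + "CCCC"
    print(local_align_score(a, b, log_gap))            # spans the long gap
    print(local_align_score(a, b, lambda k: 2.0 * k))  # stops at the gap
```

With the logarithmic penalty the six-symbol gap is cheap enough for the alignment to span both matching blocks, whereas a penalty growing linearly in k leaves only one block aligned, which is the effect the abstract attributes to non-affine penalties.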

20.
Simple but exact statistical tests for detecting a cluster of associated nucleotide changes in DNA are presented. The tests are based on the linear distribution of a set of s sites among a total of n sites, where the s sites may be the variable sites, sites of insertion/deletion, or categorized in some other way. These tests are especially useful for detecting gene conversion and intragenic recombination in a sample of DNA sequences. In this case, the sites of interest are those that correspond to particular ways of splitting the sequences into two groups (e.g., sequences A and D vs. sequences B, C, and E-J). Each such split is termed a phylogenetic partition. Application of these methods to a well-documented case of gene conversion in human gamma-globin genes shows that sites corresponding to two of the three observed partitions are significantly clustered, whereas application to hominoid mitochondrial DNA sequences (among which no recombination is expected to occur) shows no evidence of such clustering. This indicates that clustering of partition-specific sites is largely due to intragenic recombination or gene conversion. Alternative hypotheses explaining the observed clustering of sites, such as biased selection or mutation, are discussed.
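
One exact test of this general kind can be written down directly from counting. The sketch below assumes the statistic is the span (last position minus first position) of the s sites and asks how likely a span this small is when s sites are placed uniformly at random among n; the abstract does not say which statistics the paper uses, so this choice and the function names are assumptions. The count relies on the fact that the number of s-subsets of {1..n} with span exactly r is (n - r) * C(r - 1, s - 2).

```python
from math import comb


def prob_span_at_most(n, s, d):
    """P(span <= d) for s distinct sites drawn uniformly from n positions."""
    if not 2 <= s <= n:
        raise ValueError("need 2 <= s <= n")
    # Fix the span r, choose the start (n - r ways), then place the
    # remaining s - 2 sites strictly between the two end sites.
    favourable = sum(
        (n - r) * comb(r - 1, s - 2) for r in range(s - 1, min(d, n - 1) + 1)
    )
    return favourable / comb(n, s)


def cluster_p_value(positions, n):
    """Exact p-value for the observed span of the given 1-based site positions."""
    sites = sorted(set(positions))
    if len(sites) < 2 or sites[0] < 1 or sites[-1] > n:
        raise ValueError("need at least two distinct positions within 1..n")
    return prob_span_at_most(n, len(sites), sites[-1] - sites[0])


if __name__ == "__main__":
    # Three partition-specific sites packed together in a 100-site alignment.
    print(cluster_p_value([10, 11, 12], 100))
```

A small p-value means the sites supporting one phylogenetic partition sit closer together than chance placement would predict, which is the pattern the abstract associates with gene conversion or intragenic recombination.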
