Similar articles
1.
Recombination generates mosaic genomes that violate the assumption, central to traditional phylogenetic methods, that sequence evolution can be accurately described by a single tree. Results and conclusions based on phylogenetic analysis of data sets that include recombinant sequences can therefore be severely misleading. Many methods are able to adequately detect recombination between diverse sequences, for example between different HIV-1 subtypes. More problematic is the identification of recombinants among closely related sequences such as a viral population within a host. We describe a simple algorithmic procedure that enables detection of intra-host recombinants based on split-decomposition networks and a robust statistical test for recombination. By applying this algorithm to several published HIV-1 data sets we conclude that intra-host recombination was significantly underestimated in previous studies and that up to one-third of the env sequences longitudinally sampled from a given subject can be of recombinant origin. The results show that our procedure can be a valuable exploratory tool for detection of recombinant sequences before phylogenetic analysis, and also suggest that HIV-1 recombination in vivo is far more frequent and significant than previously thought.

2.
Phylogenetic analyses frequently rely on models of sequence evolution that detail nucleotide substitution rates, nucleotide frequencies, and site-to-site rate heterogeneity. These models can influence hypothesis testing and can affect the accuracy of phylogenetic inferences. Maximum likelihood (ML) methods that simultaneously construct phylogenetic tree topologies and estimate model parameters are computationally intensive, and are not feasible for sample sizes of 25 or greater using personal computers. Techniques that first construct a tree topology and then use this non-maximized topology to estimate ML substitution rates, however, can quickly arrive at a model of sequence evolution. The accuracy of this two-step estimation technique was tested using simulated data sets with known model parameters. The results showed that for a star-like topology, as is often seen in human immunodeficiency virus type 1 (HIV-1) subtype B sequences, a random starting topology could produce nucleotide substitution rates that were not statistically different from the true rates. Samples were isolated from 100 HIV-1 subtype B infected individuals from the United States and a 620 nt region of the env gene was sequenced for each sample. The sequence data were used to obtain a substitution model of sequence evolution specific for HIV-1 subtype B env by estimating nucleotide substitution rates and site-to-site rate heterogeneity. The method of estimating the model should provide users of large data sets with a way to quickly compute a model of sequence evolution, while the nucleotide substitution model we identified should prove useful in the phylogenetic analysis of HIV-1 subtype B env sequences. Received: 4 October 2000 / Accepted: 1 March 2001

3.
(1) Observations of different organisms can often be used to infer environmental conditions at a site. These inferences may be useful for diagnosing the causes of degradation in streams and rivers. (2) When used for diagnosis, biological inferences must not only provide accurate, unbiased predictions of environmental conditions, but also pairs of inferred environmental variables must covary no more strongly than actual measurements of those same environmental variables. (3) Mathematical analysis of the relationship between the measured and inferred values of different environmental variables provides an approach for comparing the covariance between measurements with the covariance between inferences. Then, simulated and field-collected data are used to assess the performance of weighted average and maximum likelihood inference methods. (4) Weighted average inferences became less accurate as covariance in the calibration data increased, whereas maximum likelihood inferences were unaffected by covariance in the calibration data. In contrast, the accuracy of weighted average inferences was unaffected by changes in measurement error, whilst the accuracy of maximum likelihood inferences decreased as measurement error increased. Weighted average inferences artificially increased the covariance of environmental variables beyond what was expected from measurements, whereas maximum likelihood inference methods more accurately reproduced the expected covariances. (5) Multivariate maximum likelihood inference methods can potentially provide more useful diagnostic information than single variable inference models.
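
As a rough illustration of the weighted average inference method compared in this abstract, the sketch below uses the textbook form of weighted averaging: each taxon's optimum is the abundance-weighted mean of the measured variable over calibration sites, and the inferred value at a new site is the abundance-weighted mean of those optima. The function names and the toy data are assumptions, and refinements such as deshrinking are left out, so this is not necessarily the exact procedure the authors evaluated.

```python
def taxon_optima(abundances, env):
    """Estimate each taxon's optimum from calibration data.

    abundances[i][k]: abundance of taxon k at calibration site i
    env[i]: measured environmental value at calibration site i
    """
    n_taxa = len(abundances[0])
    optima = []
    for k in range(n_taxa):
        total = sum(site[k] for site in abundances)
        if total == 0:
            raise ValueError(f"taxon {k} is absent from all calibration sites")
        # Abundance-weighted mean of the environmental variable
        weighted = sum(site[k] * x for site, x in zip(abundances, env))
        optima.append(weighted / total)
    return optima


def infer_environment(sample, optima):
    """Infer the environmental value at a site from its taxon abundances."""
    total = sum(sample)
    if total == 0:
        raise ValueError("sample contains no organisms")
    # Abundance-weighted mean of the taxon optima
    return sum(y * u for y, u in zip(sample, optima)) / total


# Toy calibration set (hypothetical): two sites, two taxa
optima = taxon_optima([[10, 0], [0, 10]], [1.0, 3.0])
print(infer_environment([5, 5], optima))
```

Because each inference is an average of averages, weighted average estimates are pulled toward the centre of the calibration range, which is one reason deshrinking steps are commonly added in practice.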

4.
Retroviral recombinants result from template switching between copackaged viral genomes. Here, marker reassortment between coexpressed vectors was measured during single replication cycles, and human immunodeficiency virus type 1 (HIV-1) recombination was observed six- to sevenfold more frequently than murine leukemia virus (MLV) recombination. Template switching was also assayed by using transduction-type vectors in which donor and acceptor template regions were joined covalently. In this situation, where RNA copackaging could not vary, MLV and HIV-1 template switching rates were indistinguishable. These findings argue that MLV's lower intermolecular recombination frequency does not reflect enzymological differences. Instead, these data suggest that recombination rates differ because coexpressed MLV RNAs are less accessible to the recombination machinery than are coexpressed HIV RNAs. This hypothesis provides a plausible explanation for why most gammaretrovirus recombinants, although relatively rare, display evidence of multiple nonselected crossovers. By implying that recombinogenic template switching occurs roughly four times on average during the synthesis of every MLV or HIV-1 DNA, these results suggest that virtually all products of retroviral replication are biochemical recombinants.

5.
It is difficult to directly observe processes like natural selection at the genetic level, but relatively easy to estimate genetic frequencies in populations. As a result, genetic frequency data are widely used to make inferences about the underlying evolutionary processes. However, multiple processes can generate the same patterns of frequency data, making such inferences weak. By studying the limits to the underlying processes, one can make inferences from frequency data by asking how strong selection (or some other process of interest) would have to be to generate the observed pattern. Here we present results of a study of the limits to the relationship between selection and recombination in two-locus, two-allele systems in which we found the limiting relationships for over 30,000 sets of parameters, effectively covering the range of two-locus, two-allele problems. Our analysis relates T_min (the minimum time for a population to evolve from the initial to the final conditions) to the strengths of selection and recombination, the amount of linkage disequilibrium, and the Nei distance between the initial and final conditions. T_min can be large with either large disequilibrium and small Nei distance, or the reverse. The behavior of T_min provides information about the limiting relationships between selection and recombination. Our methods allow evolutionary inferences from frequency data when deterministic processes like selection and recombination are operating; in this sense they complement methods based entirely on drift.
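
The two-locus, two-allele setting in this abstract can be made concrete with a short sketch of linkage disequilibrium and its decay under recombination. It uses only the standard deterministic recursion for random mating without selection, under which D shrinks by a factor of (1 - r) per generation; it does not reproduce the authors' T_min analysis, and the function names and example frequencies are assumptions.

```python
def linkage_disequilibrium(x_AB, x_Ab, x_aB, x_ab):
    """Return D = x_AB * x_ab - x_Ab * x_aB for haplotype frequencies summing to 1."""
    total = x_AB + x_Ab + x_aB + x_ab
    if abs(total - 1.0) > 1e-9:
        raise ValueError("haplotype frequencies must sum to 1")
    return x_AB * x_ab - x_Ab * x_aB


def recombine(freqs, r):
    """Advance haplotype frequencies one generation.

    Assumes random mating, recombination fraction r, and no selection or drift.
    """
    x_AB, x_Ab, x_aB, x_ab = freqs
    d = linkage_disequilibrium(*freqs)
    # Recombination moves r * D of frequency from coupling to repulsion haplotypes
    return (x_AB - r * d, x_Ab + r * d, x_aB + r * d, x_ab - r * d)


# Hypothetical starting point: only AB and ab haplotypes present
freqs = (0.5, 0.0, 0.0, 0.5)
for generation in range(3):
    print(generation, linkage_disequilibrium(*freqs))
    freqs = recombine(freqs, r=0.1)
```

Allele frequencies stay constant under this recursion; only the association between the two loci decays, which is why selection is needed to explain changes in allele frequency over a given time span.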

6.
Patterns of genetic variation in natural populations are shaped by, and hence carry valuable information about, the underlying recombination process. In the past five years, the increasing availability of large-scale population genetic data on dense sets of markers, coupled with advances in statistical methods for extracting information from these data, has led to several important advances in our understanding of the recombination process in humans. These advances include the identification of large numbers of 'hotspots', where recombination appears to take place considerably more frequently than in the surrounding sequence, and the identification of DNA sequence motifs that are associated with the locations of these hotspots.

7.
Human immunodeficiency virus (HIV) infects different organs and tissues. During these infection events, subpopulations of HIV type 1 (HIV-1) develop and, if viral trafficking is restricted between subpopulations, the viruses can follow independent evolutionary histories, i.e., become compartmentalized. This phenomenon is usually detected via comparative sequence analysis and has been reported for viruses isolated from the central nervous system (CNS) and the genital tract. Several approaches have been proposed to study the compartmentalization of HIV sequences, but to date, no rigorous comparison of the most commonly employed methods has been made. In this study, we systematically compared inferences made by six different methods for detecting compartmentalization based on three data sets: (i) a sample of 45 patients with sequences gathered from the CNS, (ii) sequences from the female genital tract of 18 patients, and (iii) a set of simulated sequences. We found that different methods often reached contradictory conclusions. Methods based on the topology of a phylogenetic tree derived from clonal sequences were generally more sensitive in detecting compartmentalization than those that relied solely upon pairwise genetic distances between sequences. However, as the branching structure in a phylogenetic tree is often uncertain, especially for short, low-diversity, or recombinant sequences, tree-based approaches may need to be modified to take phylogenetic uncertainty into account. Given the frequently discordant predictions of different methods and the strengths and weaknesses of each particular methodology, we recommend that a suite of several approaches be used for reliable inference of compartmentalized population structure.

8.
J S Lopes, M Arenas, D Posada, M A Beaumont. Heredity 2014, 112(3): 255-264
The estimation of parameters in molecular evolution may be biased when some processes are not considered. For example, the estimation of selection at the molecular level using codon-substitution models can have an upward bias when recombination is ignored. Here we address the joint estimation of recombination, molecular adaptation and substitution rates from coding sequences using approximate Bayesian computation (ABC). We describe the implementation of a regression-based strategy for choosing subsets of summary statistics for coding data, and show that this approach can accurately infer recombination (allowing for intracodon recombination breakpoints), molecular adaptation and codon substitution rates. We demonstrate that our ABC approach can outperform other analytical methods under a variety of evolutionary scenarios. We also show that although the choice of the codon-substitution model is important, our inferences are robust to a moderate degree of model misspecification. In addition, we demonstrate that our approach can accurately choose the evolutionary model that best fits the data, providing an alternative when the use of full-likelihood methods is impracticable. Finally, we applied our ABC method to co-estimate recombination, substitution and molecular adaptation rates from 24 published human immunodeficiency virus 1 coding data sets.

9.
The composite-likelihood estimator (CLE) of the population recombination rate considers only sites with exactly two alleles under a finite-sites mutation model (McVean, G. A. T., P. Awadalla, and P. Fearnhead. 2002. A coalescent-based method for detecting and estimating recombination from gene sequences. Genetics 160:1231-1241). While in such a model the identity of alleles is not considered, the CLE has been shown to be robust to minor misspecification of the underlying mutational model. However, there are many situations where the putative mutation and demographic history can be quite complex. One good example is rapidly evolving pathogens, like HIV-1. First, we evaluated the performance of the CLE and the likelihood permutation test (LPT) under more complex, realistic models, including a general time reversible (GTR) substitution model, rate heterogeneity among sites (Gamma), positive selection, population growth, population structure, and noncontemporaneous sampling. Second, we relaxed some of the assumptions of the CLE, allowing for a four-allele, GTR + Gamma model in an attempt to use the data more efficiently. Through simulations and the analysis of real data, we concluded that the CLE is robust to severe misspecifications of the substitution model, but underestimates the recombination rate in the presence of exponential growth, population mixture, selection, or noncontemporaneous sampling. In such cases, the use of more complex models slightly increases performance on some occasions, especially in the case of the LPT. Thus, our results provide for a more robust application of the estimation of recombination rates.

10.
The study estimated the prevalence of HIV-1 intra-subtype recombinant variants among female bar and hotel workers in Tanzania. While intra-subtype recombination occurs in HIV-1, it is generally underestimated. HIV-1 env gp120 V1-C5 quasispecies from 45 subjects were generated by single-genome amplification and sequencing (median (IQR) of 38 (28–50) sequences per subject). Recombination analysis was performed using seven methods implemented within the recombination detection program version 3, RDP3. HIV-1 sequences were considered recombinant if recombination signals were detected by at least three methods with p-values of ≤0.05 after Bonferroni correction for multiple comparisons. HIV-1 in 38 (84%) subjects showed evidence for intra-subtype recombination including 22 with HIV-1 subtype A1, 13 with HIV-1 subtype C, and 3 with HIV-1 subtype D. The distribution of intra-patient recombination breakpoints suggested ongoing recombination and showed selective enrichment of recombinant variants in 23 (60%) subjects. The number of subjects with evidence of intra-subtype recombination increased from 29 (69%) to 36 (82%) over one year of follow-up, although the increase did not reach statistical significance. Adjustment for intra-subtype recombination is important for the analysis of multiplicity of HIV infection. This is the first report of high prevalence of intra-subtype recombination in the HIV/AIDS epidemic in Tanzania, a region where multiple HIV-1 subtypes co-circulate. HIV-1 intra-subtype recombination increases viral diversity and presents additional challenges for HIV-1 vaccine design.
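
The decision rule stated in this abstract (a sequence counts as recombinant when at least three of the seven methods give p ≤ 0.05 after Bonferroni correction) can be sketched as below. This is a minimal illustration and not RDP3's implementation: the function name, the placeholder method labels, the example p-values, and the number of comparisons passed in are all assumptions.

```python
def is_recombinant(raw_p_values, n_comparisons, alpha=0.05, min_methods=3):
    """Apply a consensus rule across several recombination detection methods.

    raw_p_values: dict mapping a method label to its uncorrected p-value
    n_comparisons: number of tests the Bonferroni correction accounts for
                   (an assumed input; the abstract does not say how it was set)
    """
    significant = [
        name
        for name, p in raw_p_values.items()
        # Bonferroni: scale each p-value by the number of comparisons, cap at 1
        if min(1.0, p * n_comparisons) <= alpha
    ]
    return len(significant) >= min_methods


# Hypothetical uncorrected p-values for one sequence; labels are placeholders
example = {"m1": 1e-6, "m2": 2e-5, "m3": 4e-4, "m4": 0.02,
           "m5": 0.3, "m6": 0.8, "m7": 0.01}
print(is_recombinant(example, n_comparisons=100))
```

Requiring agreement between several methods trades some sensitivity for a lower false-positive rate, which matters when sequences within one host are closely related.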

13.
Consequences of recombination on traditional phylogenetic analysis (cited 38 times: 0 self-citations, 38 citations by others)
Schierup MH, Hein J. Genetics 2000, 156(2): 879-891
We investigate the shape of a phylogenetic tree reconstructed from sequences evolving under the coalescent with recombination. The motivation is that evolutionary inferences are often made from phylogenetic trees reconstructed from population data even though recombination may well occur (mtDNA or viral sequences) or does occur (nuclear sequences). We investigate the size and direction of biases when a single tree is reconstructed ignoring recombination. Standard software (PHYLIP) was used to construct the best phylogenetic tree from sequences simulated under the coalescent with recombination. With recombination present, the length of terminal branches and the total branch length are larger, and the time to the most recent common ancestor smaller, than for a tree reconstructed from sequences evolving with no recombination. The effects are pronounced even for small levels of recombination that may not be immediately detectable in a data set. The phylogenies when recombination is present superficially resemble phylogenies for sequences from an exponentially growing population. However, exponential growth has a different effect on statistics such as Tajima's D. Furthermore, ignoring recombination leads to a large overestimation of the substitution rate heterogeneity and the loss of the molecular clock. These results are discussed in relation to viral and mtDNA data sets.

14.
The use of mutagenic drugs to drive HIV-1 past its error threshold presents a novel intervention strategy, as suggested by the quasispecies theory, that may be less susceptible to failure via viral mutation-induced emergence of drug resistance than current strategies. The error threshold of HIV-1, however, is not known. Applying the quasispecies theory to determine it poses significant challenges: whereas the quasispecies theory considers the asexual reproduction of an infinitely large population of haploid individuals, HIV-1 is diploid, undergoes recombination, and is estimated to have a small effective population size in vivo. We performed population genetics-based stochastic simulations of the within-host evolution of HIV-1 and estimated the structure of the HIV-1 quasispecies and its error threshold. We found that with small mutation rates, the quasispecies was dominated by genomes with few mutations. Upon increasing the mutation rate, a sharp error catastrophe occurred where the quasispecies became delocalized in sequence space. Using parameter values that quantitatively captured data of viral diversification in HIV-1 patients, we estimated the error threshold (in substitutions/site/replication) to be ∼2–6 fold higher than the natural mutation rate of HIV-1, suggesting that HIV-1 survives close to its error threshold and may be readily susceptible to mutagenic drugs. The latter estimate was weakly dependent on the within-host effective population size of HIV-1. With large population sizes and in the absence of recombination, our simulations converged to the quasispecies theory, bridging the gap between quasispecies theory and population genetics-based approaches to describing HIV-1 evolution. Further, the error threshold increased with the recombination rate, rendering HIV-1 less susceptible to error catastrophe, thus elucidating an added benefit of recombination to HIV-1. Our estimate of the error threshold may serve as a quantitative guideline for the use of mutagenic drugs against HIV-1.

15.
Recombinant HIV-1 genomes contribute significantly to the diversity of variants within the HIV/AIDS pandemic. It is assumed that some of these mosaic genomes may have novel properties that have led to their prevalence, particularly in the case of the circulating recombinant forms (CRFs). In regions of the HIV-1 genome where recombination has a tendency to convey a selective advantage to the virus, we predict that the distribution of breakpoints (the identifiable boundaries that delimit the mosaic structure) will deviate from the underlying null distribution. To test this hypothesis, we generate a probabilistic model of HIV-1 copy-choice recombination and compare the predicted breakpoint distribution to the distribution from the HIV/AIDS pandemic. Across much of the HIV-1 genome, we find that the observed frequencies of inter-subtype recombination are predicted accurately by our model. This observation strongly indicates that in these regions a probabilistic model, dependent on local sequence identity, is sufficient to explain breakpoint locations. In regions where there is a significant over- (either side of the env gene) or under- (short regions within gag, pol, and most of env) representation of breakpoints, we infer natural selection to be influencing the recombination pattern. The paucity of recombination breakpoints within most of the envelope gene indicates that recombinants generated in this region are less likely to be successful. The breakpoints at a higher frequency than predicted by our model are approximately at either side of env, indicating increased selection for these recombinants as a consequence of this region, or at least part of it, having a tendency to be recombined as an entire unit. Our findings thus provide the first clear indication of the existence of a specific portion of the genome that deviates from a probabilistic null model for recombination. This suggests that, despite the wide diversity of recombinant forms seen in the viral population, only a minority of recombination events appear to be of significance to the evolution of HIV-1.

16.

Background

Proteases of human pathogens are becoming increasingly important drug targets, hence it is necessary to understand their substrate specificity and to interpret this knowledge in practically useful ways. New methods are being developed that produce large amounts of cleavage information for individual proteases and some have been applied to extract cleavage rules from data. However, the hitherto proposed methods for extracting rules have been neither easy to understand nor very accurate. To be practically useful, cleavage rules should be accurate, compact, and expressed in an easily understandable way.

Results

A new method is presented for producing cleavage rules for viral proteases with seemingly complex cleavage profiles. The method is based on orthogonal search-based rule extraction (OSRE) combined with spectral clustering. It is demonstrated on substrate data sets for human immunodeficiency virus type 1 (HIV-1) protease and hepatitis C (HCV) NS3/4A protease, showing excellent prediction performance for both HIV-1 cleavage and HCV NS3/4A cleavage, agreeing with observed HCV genotype differences. New cleavage rules (consensus sequences) are suggested for HIV-1 and HCV NS3/4A cleavages. The practical usability of the method is also demonstrated by using it to predict the location of an internal cleavage site in the HCV NS3 protease and to correct the location of a previously reported internal cleavage site in the HCV NS3 protease. The method is fast to converge and yields accurate rules, on par with previous results for HIV-1 protease and better than previous state-of-the-art for HCV NS3/4A protease. Moreover, the rules are fewer and simpler than previously obtained with rule extraction methods.

Conclusion

A rule extraction methodology that searches for multivariate low-order predicates yields results that significantly outperform existing rule bases on out-of-sample data and are also more transparent to expert users. The approach yields rules that are easy to use and useful for interpreting experimental data.

19.
Recombinant human immunodeficiency virus type 1 (HIV-1) strains containing sequences from different viral genetic subtypes (intersubtype) and different lineages from within the same subtype (intrasubtype) have been observed. A consequence of recombination can be the distortion of the phylogenetic signal. Several intersubtype recombinants have been identified; however, less is known about the frequency of intrasubtype recombination. For this study, near-full-length HIV-1 subtype C genomes from 270 individuals were evaluated for the presence of intrasubtype recombination. A sliding window schema (window, 2 kb; step, 385 bp) was used to partition the aligned sequences. The Shimodaira-Hasegawa test detected significant topological incongruence in 99.6% of the comparisons of the maximum-likelihood trees generated from each sequence partition, a result that could be explained by recombination. Using RECOMBINE, we detected significant levels of recombination using five random subsets of the sequences. With a set of 23 topologically consistent sequences used as references, bootscanning followed by the interactive informative site test defined recombination breakpoints. Using two multiple-comparison correction methods, 47% of the sequences showed significant evidence of recombination in both analyses. Estimated evolutionary rates were revised from 0.51%/year (95% confidence interval [CI], 0.39 to 0.53%) with all sequences to 0.46%/year (95% CI, 0.38 to 0.48%) with the putative recombinants removed. The timing of the subtype C epidemic origin was revised from 1961 (95% CI, 1947 to 1962) with all sequences to 1958 (95% CI, 1949 to 1960) with the putative recombinants removed. Thus, intrasubtype recombinants are common within the subtype C epidemic and these impact analyses of HIV-1 evolution.
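
The sliding window partitioning mentioned in this abstract (window, 2 kb; step, 385 bp) can be sketched as follows. The function names are hypothetical, and the handling of the alignment's tail (trailing columns that do not fill a complete window are dropped) is an assumption, since the abstract does not describe it.

```python
def sliding_windows(alignment_length, window=2000, step=385):
    """Yield (start, end) half-open column ranges over an alignment."""
    if alignment_length <= window:
        # Alignment shorter than one window: return it as a single partition
        yield (0, alignment_length)
        return
    start = 0
    while start + window <= alignment_length:
        yield (start, start + window)
        start += step


def partition_alignment(sequences, window=2000, step=385):
    """Cut every aligned sequence into the same set of sliding windows.

    sequences: dict mapping a sequence name to its aligned string;
    all strings must have the same length.
    """
    lengths = {len(s) for s in sequences.values()}
    if len(lengths) != 1:
        raise ValueError("aligned sequences must all have the same length")
    (length,) = lengths
    return [
        {name: seq[a:b] for name, seq in sequences.items()}
        for a, b in sliding_windows(length, window, step)
    ]


# Column ranges for a hypothetical 2,770-column alignment
print(list(sliding_windows(2770)))
```

Each partition can then be passed to a tree-building program, and the resulting trees compared for topological incongruence between windows.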
