Found 20 similar articles (search time: 0 ms)
1.
Patterns of linkage disequilibrium, homoplasy, and incompatibility are difficult to interpret because they depend on several factors, including the recombination process and the population structure. Here we introduce a novel model-based framework to infer recombination properties from such summary statistics in bacterial genomes. The underlying model is sequentially Markovian so that data can be simulated very efficiently, and we use approximate Bayesian computation techniques to infer parameters. As this does not require us to calculate the likelihood function, the model can be easily extended to investigate less probed aspects of recombination. In particular, we extend our model to account for the bias in the recombination process whereby closely related bacteria recombine more often with one another. We show that this model provides a good fit to a data set of Bacillus cereus genomes and estimate several recombination properties, including the rate of bias in recombination. All the methods described in this article are implemented in a software package that is freely available for download at http://code.google.com/p/clonalorigin/.
2.
We study the detection of mutations, sequencing errors, and homologous recombination events (HREs) in a set of closely related microbial genomes. We base the model on single nucleotide polymorphisms (SNPs) and break the genomes into blocks to handle the rearrangement problem. Then we apply a dynamic programming algorithm to model whether changes within each block are likely a result of mutations, sequencing errors, or HREs. Results from simulation experiments show that we can detect 31%–61% of HREs and the precision of our detection is about 48%–90% depending on the rates of mutation and missing data. The HREfinder software for predicting HREs in a set of whole genomes is available as open source (http://sourceforge.net/projects/hrefinder/).
3.
Whole genome sequencing (WGS) shows great potential for real-time monitoring and identification of infectious disease outbreaks. However, rapid and reliable comparison of data generated in multiple laboratories and using multiple technologies is essential. So far, studies have focused on using one technology because each technology has a systematic bias, making integration of data generated from different platforms difficult. We developed two different procedures for identifying variable sites and inferring phylogenies in WGS data across multiple platforms. The methods were evaluated on three bacterial data sets sequenced on three different platforms (Illumina, 454, Ion Torrent). We show that the methods are able to overcome the systematic biases caused by the sequencers and infer the expected phylogenies. We conclude that these procedures succeed because all informative sites included in the analysis are validated. The procedures are available as web tools.
4.
5.
A novel procedure was used for cloning a large adenovirus genome fragment by homologous recombination in E. coli strain BJ5183. The 11.2 kb downstream fragment of the CAV-2 strain YCA18 genome was cloned by homologous recombination: the 1029 bp left end and the 970 bp right end of this fragment were separately amplified by PCR. They were then cloned into plasmid pPoly2 in left-to-right orientation, yielding a “rescue” plasmid, pT615. pT615 was linearized by HindIII and PstI digestion and cotransformed, together with purified CAV-2 genome cut by BstBI, into competent E. coli strain BJ5183. Recombinant plasmids harboring the 11.2 kb downstream fragment of the CAV-2 genome were obtained after bacterial intermolecular homologous recombination. The recombination efficiency of all E. coli strains tested was 78.3%. One of the recombinant plasmids, pT618, was further characterized by enzyme digestion analysis and PCR amplification. The results showed that the plasmids contained the 11.2 kb downstream fragment of the CAV-2 genome.
6.
Jayavel Sridhar Radhakrishnan Sabarinathan Shanmugam Siva Balan Ziauddin Ahamed Rafi Paramasamy Gunasekaran Kanagaraj Sekar 《Genomics, Proteomics & Bioinformatics》2011,(Z2)
In the past few decades, scientists from all over the world have taken a keen interest in novel functional units such as small regulatory RNAs, small open reading frames, pseudogenes, transposons, integrase binding attB/attP sites, repeat elements within the bacterial intergenic regions (IGRs) and in the analysis of those junk regions for genomic complexity. Here we have developed a web server, named Junker, to facilitate the in-depth analysis of IGRs for examining their length distribution, four-quadrant...
7.
The capability and speed in generating genomic data have increased profoundly since the release of the draft human genome in 2000. Additionally, sequencing costs have continued to plummet as the next generation of highly efficient sequencing technologies (next-generation sequencing) became available and commercial facilities promote market competition. However, new challenges have emerged as researchers attempt to efficiently process the massive amounts of sequence data being generated. First, the described genome sequences are unequally distributed among the branches of bacterial life and, second, bacterial pan-genomes are often not considered when setting aims for sequencing projects. Here, we propose that scientists should be concerned with attaining an improved equal representation of most of the bacterial tree of life organisms, at the genomic level. Moreover, they should take into account the natural variation that is often observed within bacterial species and the role of the often changing surrounding environment and natural selection pressures, which is central to bacterial speciation and genome evolution. Not only will such efforts contribute to our overall understanding of the microbial diversity extant in ecosystems as well as the structuring of the extant genomes, but they will also facilitate the development of better methods for (meta)genome annotation.
8.
9.
Although bacterial species display wide variation in their overall GC contents, the genes within a particular species' genome
are relatively similar in base composition. As a result, sequences that are novel to a bacterial genome—i.e., DNA introduced
through recent horizontal transfer—often bear unusual sequence characteristics and can be distinguished from ancestral DNA.
At the time of introgression, horizontally transferred genes reflect the base composition of the donor genome; but, over time,
these sequences will ameliorate to reflect the DNA composition of the new genome because the introgressed genes are subject
to the same mutational processes affecting all genes in the recipient genome. This process of amelioration is evident in a
large group of genes involved in host-cell invasion by enteric bacteria and can be modeled to predict the amount of time required
after transfer for foreign DNA to resemble native DNA. Furthermore, models of amelioration can be used to estimate the time
of introgression of foreign genes in a chromosome. Applying this approach to a 1.43-megabase continuous sequence, we have
calculated that the entire Escherichia coli chromosome contains more than 600 kb of horizontally transferred, protein-coding DNA. Estimates of amelioration times indicate
that this DNA has accumulated at a rate of 31 kb per million years, which is on the order of the amount of variant DNA introduced
by point mutations. This rate predicts that the E. coli and Salmonella enterica lineages have each gained and lost more than 3 megabases of novel DNA since their divergence.
Received: 7 July 1996 / Accepted: 27 September 1996
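The observation in this abstract that recently transferred DNA tends to have atypical base composition can be illustrated with a small sketch. This is only a toy GC-content screen, not the amelioration model the authors describe; the function names and the deviation threshold are hypothetical choices made for illustration.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C among the unambiguous bases (A, C, G, T) of seq."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    total = gc + seq.count("A") + seq.count("T")
    return gc / total if total else 0.0


def atypical_genes(genes: dict[str, str], threshold: float = 0.10) -> list[str]:
    """Names of genes whose GC content deviates from the pooled GC content.

    `threshold` is an arbitrary illustrative cutoff (absolute difference in
    GC fraction), not a value taken from the abstract.
    """
    pooled = gc_content("".join(genes.values()))
    return [
        name
        for name, seq in genes.items()
        if abs(gc_content(seq) - pooled) > threshold
    ]
```

In practice a screen like this would be run on annotated coding sequences and combined with other evidence, since base composition alone cannot confirm horizontal transfer.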
10.
A plethora of algorithmic assemblers have been proposed for the de novo assembly of genomes; however, no individual assembler guarantees the optimal assembly for diverse species. Various parameters of an assembler are often tuned in order to generate the best possible assembly. However, few efforts have been made to take advantage of multiple assemblies to yield an assembly of high accuracy. In this study, we employ various state-of-the-art assemblers to generate different sets of contigs for bacterial genomes. A tool, named CISA, has been developed to integrate the assemblies into a hybrid set of contigs, resulting in assemblies of superior contiguity and accuracy, compared with the assemblies generated by the state-of-the-art assemblers and the hybrid assemblies merged by existing tools. This tool is implemented in Python and requires MUMmer and BLAST+ to be installed on the local machine. The source code of CISA and examples of its use are available at http://sb.nhri.org.tw/CISA/.
11.
12.
An Efficient Method of Selectable Marker Gene Excision by Xer Recombination for Gene Replacement in Bacterial Chromosomes
A simple, effective method of unlabeled, stable gene insertion into bacterial chromosomes has been developed. This utilizes an insertion cassette consisting of an antibiotic resistance gene flanked by dif sites and regions homologous to the chromosomal target locus. dif is the recognition sequence for the native Xer site-specific recombinases responsible for chromosome and plasmid dimer resolution: XerC/XerD in Escherichia coli and RipX/CodV in Bacillus subtilis. Following integration of the insertion cassette into the chromosomal target locus by homologous recombination, these recombinases act to resolve the two directly repeated dif sites to a single site, thus excising the antibiotic resistance gene. Previous approaches have required the inclusion of exogenous site-specific recombinases or transposases in trans; our strategy demonstrates that this is unnecessary, since an effective recombination system is already present in bacteria. The high recombination frequency makes the inclusion of a counter-selectable marker gene unnecessary.
13.
Metagenomic sequencing projects from environments dominated by a small number of species produce genome-wide population samples. We present a two-site composite likelihood estimator of the scaled recombination rate, ρ = 2N_e c, that operates on metagenomic assemblies in which each sequenced fragment derives from a different individual. This new estimator properly accounts for sequencing error, as quantified by per-base quality scores, and missing data, as inferred from the placement of reads in a metagenomic assembly. We apply our estimator to data from a sludge metagenome project to demonstrate how this method will elucidate the rates of exchange of genetic material in natural microbial populations. Surprisingly, for a fixed amount of sequencing, this estimator has lower variance than similar methods that operate on more traditional population genetic samples of comparable size. In addition, we can infer variation in recombination rate across the genome because metagenomic projects sample genetic diversity genome-wide, not just at particular loci. The method itself makes no assumption specific to microbial populations, opening the door for application to any mixed population sample where the number of individuals sampled is much greater than the number of fragments sequenced.
14.
15.
Concatamerization of Adeno-Associated Virus Circular Genomes Occurs through Intermolecular Recombination
Jusan Yang Weihong Zhou Yulong Zhang Terese Zidon Terry Ritchie John F. Engelhardt 《Journal of virology》1999,73(11):9468-9477
Long-term recombinant AAV (rAAV) transgene expression in muscle has been associated with the molecular conversion of single-stranded rAAV genomes to high-molecular-weight head-to-tail circular concatamers. However, the mechanisms by which these large multimeric concatamers form remain to be defined. To this end, we tested whether concatamerization of rAAV circular intermediates occurs through intra- or intermolecular mechanisms of amplification. Coinfection of the tibialis muscle of mice with rAAV alkaline phosphatase (Alkphos)- and green fluorescent protein (GFP)-encoding vectors was used to evaluate the frequency of circular concatamer formation by intermolecular recombination of independent viral genomes. The GFP shuttle vector also encoded ampicillin resistance and contained a bacterial origin of replication to allow for bacterial rescue of circular intermediates from Hirt DNA of infected muscle samples. The results demonstrated a time-dependent increase in the abundance of rescued plasmids encoding both GFP and Alkphos, which reached 33% of the total circular intermediates by 120 days postinfection. Furthermore, these large circular concatamers were capable of expressing both GFP- and Alkphos-encoding transgenes following transient transfection in cell lines. These findings demonstrate that concatamerization of AAV genomes in vivo occurs through intermolecular recombination of independent monomer circular viral genomes and suggest new viable strategies for delivering multiple DNA segments at a single locus. Such developments will expand the utility of rAAV for splicing large gene inserts or large promoter-gene combinations carried by two or more independent rAAV vectors.
16.
Florent Lassalle Séverine Périan Thomas Bataillon Xavier Nesme Laurent Duret Vincent Daubin 《PLoS genetics》2015,11(2)
The characterization of functional elements in genomes relies on the identification of the footprints of natural selection. In this quest, taking into account neutral evolutionary processes such as mutation and genetic drift is crucial because these forces can generate patterns that may obscure or mimic signatures of selection. In mammals, and probably in many eukaryotes, another such confounding factor called GC-Biased Gene Conversion (gBGC) has been documented. This mechanism generates patterns identical to what is expected under selection for higher GC-content, specifically in highly recombining genomic regions. Recent results have suggested that a mysterious selective force favouring higher GC-content exists in Bacteria but the possibility that it could be gBGC has been excluded. Here, we show that gBGC is probably at work in most if not all bacterial species. First we find a consistent positive relationship between the GC-content of a gene and evidence of intra-genic recombination throughout a broad spectrum of bacterial clades. Second, we show that the evolutionary force responsible for this pattern is acting independently from selection on codon usage, and could potentially interfere with selection in favor of optimal AU-ending codons. A comparison with data from human populations shows that the intensity of gBGC in Bacteria is comparable to what has been reported in mammals. We propose that gBGC is not restricted to sexual Eukaryotes but also widespread among Bacteria and could therefore be an ancestral feature of cellular organisms. We argue that if gBGC occurs in bacteria, it can account for previously unexplained observations, such as the apparent non-equilibrium of base substitution patterns and the heterogeneity of gene composition within bacterial genomes. Because gBGC produces patterns similar to positive selection, it is essential to take this process into account when studying the evolutionary forces at work in bacterial genomes.
17.
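The RNA-Z curve and its application to viral gene recognition (cited by: 2; self-citations: 0; citations by others: 2)
The Z-curve method, proposed in the mid-1990s, explained from a geometric perspective how genes can be identified, and it achieved very good experimental results. However, it is built entirely on the structure of DNA sequences and performs poorly at identifying genes in RNA viruses. The RNA-Z curve method proposed in this paper remedies this shortcoming.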
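For orientation, the classic DNA Z-curve that this abstract builds on can be sketched as follows. The sketch uses the standard Z-curve definition, in which each prefix of the sequence maps to a point built from cumulative base counts: x contrasts purines with pyrimidines, y contrasts amino with keto bases, and z contrasts weak with strong hydrogen-bonding bases. It does not implement the RNA-Z curve itself, whose definition the abstract does not give; treating U as T is an assumption made here only so that RNA input is accepted, and the function name is hypothetical.

```python
def z_curve(seq: str) -> list[tuple[int, int, int]]:
    """Return the classic Z-curve points (x_n, y_n, z_n) for each prefix of seq.

    x_n = (A_n + G_n) - (C_n + T_n)   purine vs. pyrimidine
    y_n = (A_n + C_n) - (G_n + T_n)   amino vs. keto
    z_n = (A_n + T_n) - (C_n + G_n)   weak vs. strong hydrogen bonding
    where A_n, C_n, G_n, T_n count each base in the first n positions.
    """
    # Assumption for illustration: RNA input is handled by reading U as T.
    seq = seq.upper().replace("U", "T")
    counts = {"A": 0, "C": 0, "G": 0, "T": 0}
    points = []
    for base in seq:
        if base not in counts:
            raise ValueError(f"unsupported base: {base!r}")
        counts[base] += 1
        a, c, g, t = (counts[b] for b in "ACGT")
        points.append(((a + g) - (c + t), (a + c) - (g + t), (a + t) - (c + g)))
    return points
```

Each sequence corresponds to exactly one curve, so features such as the trend of the z component along a genome can be read directly from the returned points.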
18.
Mark Lipson Po-Ru Loh Sriram Sankararaman Nick Patterson Bonnie Berger David Reich 《PLoS genetics》2015,11(11)
The human mutation rate is an essential parameter for studying the evolution of our species, interpreting present-day genetic variation, and understanding the incidence of genetic disease. Nevertheless, our current estimates of the rate are uncertain. Most notably, recent approaches based on counting de novo mutations in family pedigrees have yielded significantly smaller values than classical methods based on sequence divergence. Here, we propose a new method that uses the fine-scale human recombination map to calibrate the rate of accumulation of mutations. By comparing local heterozygosity levels in diploid genomes to the genetic distance scale over which these levels change, we are able to estimate a long-term mutation rate averaged over hundreds or thousands of generations. We infer a rate of (1.61 ± 0.13) × 10⁻⁸ mutations per base per generation, which falls in between phylogenetic and pedigree-based estimates, and we suggest possible mechanisms to reconcile our estimate with previous studies. Our results support intermediate-age divergences among human populations and between humans and other great apes.
19.
Greenspan G Geiger D Gotch F Bower M Patterson S Nelson M Gazzard B Stebbing J 《Journal of molecular evolution》2004,58(3):239-251
An emergent problem in the study of pathogen evolution is our ability to determine the extent to which their rapidly evolving genomes recombine. Such information is essential for locating pathogenicity loci using association studies, and it also directs future screening, therapeutic and vaccination strategies. Recombination also complicates the use of phylogenetic approaches to infer evolutionary parameters including selection pressures. Reliable methods that identify the presence of regions of recombination are therefore vital. We illustrate the use of an integrated model-based approach to inferring recombination structure using all available sequences of the highly variable, transforming Kaposi's sarcoma-associated herpesviral gene, ORF-K1. This technique learns the parameters of a statistical model that takes recombination hotspots, population genetic effects, and variable rates of mutation into account. As there are no known mechanisms to explain the high mutation rate in this DNA viral gene, recombination may account for some of the variability observed. We infer recombination hotspots in conserved sites such as the tyrosine kinase signaling motif, referred to here as recombination drift, as well as in nonconserved sites, a process described as recombination shift. This article contains online supplementary material.