Similar articles
1.
2.
Inference of bacterial microevolution using multilocus sequence data   Cited by: 5 (self-citations: 0, by others: 5)
Didelot X, Falush D. Genetics, 2007, 175(3): 1251-1266
We describe a model-based method for using multilocus sequence data to infer the clonal relationships of bacteria and the chromosomal position of homologous recombination events that disrupt a clonal pattern of inheritance. The key assumption of our model is that recombination events introduce a constant rate of substitutions to a contiguous region of sequence. The method is applicable both to multilocus sequence typing (MLST) data from a few loci and to alignments of multiple bacterial genomes. It can be used to decide whether a subset of isolates share common ancestry, to estimate the age of the common ancestor, and hence to address a variety of epidemiological and ecological questions that hinge on the pattern of bacterial spread. It should also be useful in associating particular genetic events with the changes in phenotype that they cause. We show that the model outperforms existing methods of subdividing recombinogenic bacteria using MLST data and provide examples from Salmonella and Bacillus. The software used in this article, ClonalFrame, is available from http://bacteria.stats.ox.ac.uk/.

3.
Species evolutionary relationships have traditionally been defined by sequence similarities of phylogenetic marker molecules, recently followed by whole-genome phylogenies based on gene order, average ortholog similarity or gene content. Here, we introduce genome conservation, a novel metric of evolutionary distance between species that simultaneously takes into account both gene content and sequence similarity at the whole-genome level. Genome conservation represents a robust distance measure, as demonstrated by accurate phylogenetic reconstructions. The genome conservation matrix for all presently sequenced organisms exhibits a remarkable ability to define evolutionary relationships across all taxonomic ranges. An assessment of taxonomic ranks with genome conservation shows that certain ranks are inadequately described and raises the possibility of a more precise and quantitative taxonomy in the future. All phylogenetic reconstructions are available at the genome phylogeny server.

4.
MOTIVATION: Horizontal gene transfer (HGT) is believed to be ubiquitous among bacteria, and plays a major role in their genome diversification as well as their ability to develop resistance to antibiotics. In light of its evolutionary significance and implications for human health, developing accurate and efficient methods for detecting and reconstructing HGT is imperative. RESULTS: In this article we provide a new HGT-oriented likelihood framework for many problems that involve phylogeny-based HGT detection and reconstruction. Besides the formulation of various likelihood criteria, we show that most of these problems are NP-hard, and offer heuristics for efficient and accurate reconstruction of HGT under these criteria. We implemented our heuristics and used them to analyze biological as well as synthetic data. In both cases, our criteria and heuristics exhibited very good performance with respect to identifying the correct number of HGT events as well as inferring their correct location on the species tree. AVAILABILITY: Implementation of the criteria as well as heuristics and hardness proofs are available from the authors upon request. Hardness proofs can also be downloaded at http://www.cs.tau.ac.il/~tamirtul/MLNET/Supp-ML.pdf

5.
How much horizontal gene transfer (HGT) between species influences bacterial phylogenomics is a controversial issue. This debate, however, lacks any quantitative assessment of the impact of HGT on phylogenies and of the ability of tree-building methods to cope with such events. I introduce a Markov model of genome evolution with HGT, accounting for the constraints on time: an HGT event can only occur between concomitantly living species. This model is used to simulate multigene sequence data sets with or without HGT. The consequences of HGT on phylogenomic inference are analyzed and compared to other well-known phylogenetic artefacts. It is found that supertree methods are quite robust to HGT, keeping high levels of performance even when gene trees are largely incongruent with each other. Gene tree incongruence per se is not indicative of HGT. HGT, however, removes the (otherwise observed) positive relationship between sequence length and gene tree congruence to the estimated species tree. Surprisingly, when applied to a bacterial and a eukaryotic multigene data set, this criterion rejects the HGT hypothesis for the former, but not the latter data set.

6.
Motif3D is a web-based protein structure viewer designed to allow sequence motifs, and in particular those contained in the fingerprints of the PRINTS database, to be visualised on three-dimensional (3D) structures. Additional functionality is provided for the rhodopsin-like G protein-coupled receptors, enabling fingerprint motifs of any of the receptors in this family to be mapped onto the single structure available, that of bovine rhodopsin. Motif3D can be used via the web interface available at: http://www.bioinf.man.ac.uk/dbbrowser/motif3d/motif3d.html.

7.
8.
Sequence analysis of the group of proteins known to be associated with hereditary diseases allows the detection of key distinctive features shared within this group. The disease proteins are characterized by greater length of their amino acid sequence, a broader phylogenetic extent, and specific conservation and paralogy profiles compared with all human proteins. This unique property pattern provides insights into the global nature of hereditary diseases and moreover can be used to predict novel disease genes. We have developed a computational method that allows the detection of genes likely to be involved in hereditary disease in the human genome. The probability score assignments for the human genome are accessible at http://maine.ebi.ac.uk:8000/services/dgp.

9.
MOTIVATION: The best quality multiple sequence alignments are generally considered to derive from structural superposition. However, no previous work has studied the relative performance of profile hidden Markov models (HMMs) derived from such alignments. Therefore several alignment methods have been used to generate multiple sequence alignments from 348 structurally aligned families in the HOMSTRAD database. The performance of profile HMMs derived from the structural and sequence-based alignments has been assessed for homologue detection. RESULTS: The best alignment methods studied here correctly align nearly 80% of residues with respect to structure alignments. Alignment quality and model sensitivity are found to be dependent on average number, length, and identity of sequences in the alignment. The striking conclusion is that, although structural data may improve the quality of multiple sequence alignments, this does not add to the ability of the derived profile HMMs to find sequence homologues. SUPPLEMENTARY INFORMATION: A list of HOMSTRAD families used in this study and the corresponding Pfam families is available at http://www.sanger.ac.uk/Users/sgj/alignments/map.html Contact: sgj@sanger.ac.uk

10.

Background  

Horizontal gene transfer (HGT) has allowed bacteria to evolve many new capabilities. Because transferred genes perform many medically important functions, such as conferring antibiotic resistance, improved detection of horizontally transferred genes from sequence data would be an important advance. Existing sequence-based methods for detecting HGT focus on changes in nucleotide composition or on differences between gene and genome phylogenies; these methods have high error rates.
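
The composition-based idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the method of any paper listed here: it assumes, purely for illustration, that a gene is "atypical" when its GC content lies more than a chosen number of standard deviations from the mean over all genes. The function names, the `z_cutoff` parameter and the toy sequences are hypothetical.

```python
from statistics import mean, pstdev


def gc_content(seq: str) -> float:
    """Fraction of G or C bases in a DNA sequence (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def flag_atypical_genes(genes: dict[str, str], z_cutoff: float = 2.0) -> list[str]:
    """Return names of genes whose GC content deviates strongly from the mean.

    The z-score rule and the default cutoff are illustrative assumptions,
    not values taken from the literature above.
    """
    gc = {name: gc_content(seq) for name, seq in genes.items()}
    mu = mean(gc.values())
    sigma = pstdev(gc.values())
    if sigma == 0:
        return []  # all genes have identical composition
    return sorted(
        name for name, value in gc.items() if abs(value - mu) / sigma > z_cutoff
    )


# Hypothetical toy genome: five genes with balanced composition, one AT-rich gene.
toy_genes = {name: "ATGC" * 3 for name in ("g1", "g2", "g3", "g4", "g5")}
toy_genes["alien"] = "AT" * 6

print(flag_atypical_genes(toy_genes))
```

A screen of this kind flags any compositionally unusual gene, whatever the cause, which is consistent with the abstract's remark that such methods have high error rates.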

11.
MOTIVATION: How critical is the sequence order information in predicting protein secondary structure segments? We sought rough insight into this question through a theoretical approach using both a prediction algorithm and structural fragments from the Protein Data Bank (PDB). RESULTS: Using reverse protein sequences and PDB structural fragments, we theoretically estimated the significance of the order for protein secondary structure and prediction. On average: (1) 79% of protein sequence segments resulted in the same prediction in both normal and reverse directions, which indicated a relatively high conservation of secondary structure propensity in the reverse direction; (2) the reversed sequence prediction alone performed less accurately than the normal forward sequence prediction, but comparably well (2% difference); (3) the commonly predicted regions showed a slightly higher prediction accuracy (4%) than the normal sequence prediction; and (4) structural fragments which have counterparts in reverse direction in the same protein showed a comparable degree of secondary structure conservation (73% identity with reversed structures on average for pentamers). CONTACT: jong@biosophy.org; dietmann@ebi.ac.uk; heger@ebi.ac.uk; holm@ebi.ac.uk

12.
MOTIVATION: Recombination can be a prevailing drive in shaping genome evolution. RAT (Recombination Analysis Tool) is a Java-based tool for investigating recombination events in any number of aligned sequences (protein or DNA) of any length (short viral sequences to full genomes). It is an uncomplicated and intuitive application and allows the user to view only the regions of sequence alignments they are interested in. RESULTS: RAT was applied to viral sequences. Its utility was demonstrated through the detection of a known recombinant of HIV and a detailed analysis of Noroviruses, the most common cause of viral gastroenteritis in humans. AVAILABILITY: RAT, along with a user's guide, is freely available from http://jic-bioinfo.bbsrc.ac.uk/bioinformatics-research/staff/graham_etherington/RAT.htm.

13.
Integrative and conjugative elements (ICEs) are self-mobile genetic elements found in the genomes of some bacteria. These elements may confer a fitness advantage upon their host bacteria through the cargo genes that they carry. Salmonella pathogenicity island 7 (SPI-7), found within some pathogenic strains of Salmonella enterica, possesses features indicative of an ICE and carries genes implicated in virulence. We aimed to identify and fully analyze ICEs related to SPI-7 within the genus Salmonella and other Enterobacteriaceae. We report the sequence of two novel SPI-7-like elements, found within strains of Salmonella bongori, which share 97% nucleotide identity over conserved regions with SPI-7 and with each other. Although SPI-7 within Salmonella enterica serovar Typhi appears to be fixed within the chromosome, we present evidence that these novel elements are capable of excision and self-mobility. Phylogenetic analyses show that these Salmonella mobile elements share an ancestor which existed approximately 3.6 to 15.8 million years ago. Additionally, we identified more distantly related ICEs, with distinct cargo regions, within other strains of Salmonella as well as within Citrobacter, Erwinia, Escherichia, Photorhabdus, and Yersinia species. In total, we report on a collection of 17 SPI-7 related ICEs within enterobacterial species, of which six are novel. Using comparative and mutational studies, we have defined a core of 27 genes essential for conjugation. We present a growing family of SPI-7-related ICEs whose mobility, abundance, and cargo variability indicate that these elements may have had a large impact on the evolution of the Enterobacteriaceae.

14.
D Xiong, F Xiao, L Liu, K Hu, Y Tan, S He, X Gao. PLoS ONE, 2012, 7(8): e43126

Background

Horizontal gene transfer (HGT) is one of the major mechanisms contributing to microbial genome diversification. A number of computational methods for finding horizontally transferred genes have been proposed in the past decades; however, none of them has yet provided a reliable detector. In existing parametric approaches, either only a single compositional property participates in the detection process, or the results obtained from each property are simply combined. Different properties carry different information, so a single property cannot sufficiently capture the information encoded by gene sequences. In addition, the class imbalance problem in the datasets, which also causes large errors in gene detection, has not been considered by published methods. Here we developed an effective classifier system (Hgtident) that uses a support vector machine (SVM) and combines unusual properties effectively for HGT detection.

Results

Our approach, Hgtident, includes the introduction of more representative datasets, optimization of the SVM model, feature selection, handling of the class imbalance problem in the datasets, and extensive performance evaluation via systematic cross-validation. Through feature selection, we found that JS-DN and JS-CB have higher discriminating power for HGT detection, while GC1–GC3 and k-mer (k = 1, 2, …, 7) make the least contribution. Extensive experiments indicated that the new classifier could reduce Mean error dramatically and also improve Recall to some extent. For the testing genomes, compared with the existing popular multiple-threshold approach, on average Recall improved by 2.81% and Mean error fell by 26.32%, which means that numerous false positives were identified correctly.

Conclusions

Hgtident, introduced here, is an effective approach for better detecting HGT. Combining multiple features of HGT is also essential for detecting a wider range of HGT events.
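
The abstract above does not define its features, so the following is only a sketch of one plausible ingredient: a Jensen-Shannon divergence between the k-mer frequency profile of a gene and that of a reference sequence. Reading JS-DN as a Jensen-Shannon divergence over dinucleotide (k = 2) frequencies is an assumption made here for illustration, and the function names are hypothetical; none of this is taken from the Hgtident implementation.

```python
from collections import Counter
from math import log2


def kmer_frequencies(seq: str, k: int) -> dict[str, float]:
    """Relative frequencies of overlapping k-mers in a sequence."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()} if total else {}


def js_divergence(p: dict[str, float], q: dict[str, float]) -> float:
    """Jensen-Shannon divergence (base 2) between two discrete distributions.

    The value is 0 for identical distributions and 1 for distributions
    with no k-mers in common.
    """
    keys = set(p) | set(q)
    # Mixture distribution M = (P + Q) / 2 over the union of observed k-mers.
    m = {key: 0.5 * (p.get(key, 0.0) + q.get(key, 0.0)) for key in keys}

    def kl_to_mixture(a: dict[str, float]) -> float:
        return sum(v * log2(v / m[key]) for key, v in a.items() if v > 0)

    return 0.5 * kl_to_mixture(p) + 0.5 * kl_to_mixture(q)


# Hypothetical usage: compare a gene's dinucleotide profile with a reference profile.
gene_profile = kmer_frequencies("ATATATATAT", 2)
reference_profile = kmer_frequencies("GCGCGCATGCGC", 2)
print(js_divergence(gene_profile, reference_profile))
```

A score like this could serve as one input feature to a classifier such as an SVM; how the published system combines and weights its features is not stated in the abstract.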

15.
Salmonella enterica has two pathogenicity islands encoding separate type three secretion systems (T3SS). Proteins secreted through these systems facilitate invasion and survival. After entry, Salmonella reside within a membrane bound vacuole, the Salmonella containing vacuole (SCV), where translocation of a second set of effectors by the Salmonella pathogenicity island 2 (SPI-2) T3SS is initiated. SPI-2 secretion in vitro can be induced by conditions that mimic the Salmonella containing vacuole. Utilising high-throughput mass spectrometry, we mapped the surface-attached proteome of S. Typhimurium SL1344 grown in vitro under SPI-2-inducing conditions and identified 108 proteins; using secretion signal prediction software, 43% of proteins identified contained a signal sequence. Of these proteins, 13 were known secreted effector proteins including SPI-2 effector proteins SseB, SseC, SseD, SseL, PipB2 and SteC, although surprisingly five were SPI-1 proteins, SipA, SipB, SipC, SipD and SopD, while 2 proteins SteA and SlrP are secreted by both T3SSs. This is the first in vitro study to demonstrate dual secretion of SPI-1 and SPI-2 proteins by S. Typhimurium and demonstrates the potential of high-throughput LC-ESI/MS/MS sequencing for the identification of novel proteins, providing a platform for subsequent comparative proteomic analysis, which should greatly assist understanding of the pathogenesis and inherent variation between serovars of Salmonella and ultimately help towards development of novel control strategies.

16.
17.
Salmonella enterica is a bacterial pathogen of humans that can proliferate within epithelial cells as well as professional phagocytes of the immune system. This ability requires an S. enterica specific locus termed Salmonella pathogenicity island 2 (SPI-2). SPI-2 encodes a type III secretion system that injects effectors encoded within the island into host cell cytosol to promote virulence. SsrAB is a two-component regulator encoded within SPI-2 that was assumed to activate SPI-2 genes exclusively. Here, it is shown that SsrB in fact activates a global regulon. At least 10 genes outside SPI-2 are SsrB regulated within epithelial and macrophage cells. Nine of these 10 SsrB-regulated genes outside SPI-2 reside within previously undescribed regions of the Salmonella genome. Most share no sequence homology with current database entries. However, one is remarkably homologous to human glucosyl ceramidase, an enzyme involved in the ceramide signalling pathway. The SsrB regulon is modulated by the two-component regulatory systems PhoP/PhoQ and OmpR/EnvZ, and is upregulated in the intracellular microenvironment.

18.
Prokaryotic organisms share genetic material across species boundaries by means of a process known as horizontal gene transfer (HGT). This process has great significance for understanding prokaryotic genome diversification and unraveling their complexities. Phylogeny-based detection of HGT is one of the most commonly used methods for this task, and is based on the fundamental fact that HGT may cause gene trees to disagree with one another, as well as with the species phylogeny. Using these methods, we can compare gene and species trees, and infer a set of HGT events to reconcile the differences among these trees. In this paper, we address three factors that confound the detection of the true HGT events, including the donors and recipients of horizontally transferred genes. First, we study experimentally the effects of error in the estimated gene trees (statistical error) on the accuracy of inferred HGT events. Our results indicate that statistical error leads to overestimation of the number of HGT events, and that HGT detection methods should be designed with unresolved gene trees in mind. Second, we demonstrate, both theoretically and empirically, that based on topological comparison alone, the number of HGT scenarios that reconcile a pair of species/gene trees may be exponential. This number may be reduced when branch lengths in both trees are estimated correctly. This set of results implies that in the absence of additional biological information, and/or a biological model of how HGT occurs, multiple HGT scenarios must be sought, and efficient strategies for how to enumerate such solutions must be developed. Third, we address the issue of lineage sorting, how it confounds HGT detection, and how to incorporate it with HGT into a single stochastic framework that distinguishes between the two events by extending population genetics theories. 
This result is very important, particularly when analyzing closely related organisms, where coalescent effects may not be ignored when reconciling gene trees. In addition to these three confounding factors, we consider the problem of enumerating all valid coalescent scenarios that constitute plausible species/gene tree reconciliations, and develop a polynomial-time dynamic programming algorithm for solving it. This result bears great significance on reducing the search space for heuristics that seek reconciliation scenarios. Finally, we show, empirically, that the locality of incongruence between a pair of trees has an impact on the numbers of HGT and coalescent reconciliation scenarios.
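
The starting point of phylogeny-based detection, measuring how far a gene tree disagrees with the species tree, can be sketched as a clade comparison on rooted trees. This is an illustrative Robinson-Foulds-style count under simplifying assumptions (rooted trees written as nested tuples of leaf names, no branch lengths); it is not the reconciliation algorithm of the paper above, and the function names and example trees are hypothetical.

```python
def clades(tree) -> set[frozenset[str]]:
    """Collect the clades (sets of leaf names) of a rooted tree given as nested tuples."""
    found: set[frozenset[str]] = set()

    def leaves(node) -> frozenset[str]:
        if isinstance(node, str):  # a leaf is a taxon name
            return frozenset([node])
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)  # every internal node defines one clade
        return members

    leaves(tree)
    return found


def rooted_rf_distance(tree_a, tree_b) -> int:
    """Number of clades present in exactly one of the two trees."""
    return len(clades(tree_a) ^ clades(tree_b))


# Hypothetical species tree and a gene tree in which taxa B and D have swapped places.
species_tree = ((("A", "B"), "C"), ("D", "E"))
gene_tree = ((("A", "D"), "C"), ("B", "E"))

print(rooted_rf_distance(species_tree, species_tree))
print(rooted_rf_distance(species_tree, gene_tree))
```

As the abstract stresses, a nonzero distance is not by itself evidence of HGT: estimation error in the gene tree and lineage sorting can both produce the same kind of incongruence.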

19.

Background

Horizontal gene transfer (HGT) has been widely identified in complete prokaryotic genomes. However, the roles of HGT among members of a microbial community and in evolution remain largely unknown. With the emergence of metagenomics, it is nontrivial to investigate such horizontal flow of genetic materials among members in a microbial community from the natural environment. Because of the lack of suitable methods for metagenomics gene transfer detection, microorganisms from a low-complexity community acid mine drainage (AMD) with near-complete genomes were used to detect possible gene transfer events and suggest the biological significance.

Results

Using the annotation of coding regions by the current tools, a phylogenetic approach, and an approximately unbiased test, we found that HGTs in AMD organisms are not rare, and we predicted 119 putative transferred genes. Among them, 14 HGT events were determined to be transfer events among the AMD members. Further analysis of the 14 transferred genes revealed that the HGT events affected the functional evolution of archaea or bacteria in AMD, and it probably shaped the community structure, such as the dominance of G-plasma in archaea in AMD through HGT.

Conclusions

Our study provides a novel insight into HGT events among microorganisms in natural communities. The interconnectedness between HGT and community evolution is essential to understand microbial community formation and development.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1720-0) contains supplementary material, which is available to authorized users.

20.
RALEE--RNA ALignment editor in Emacs   Cited by: 5 (self-citations: 0, by others: 5)
SUMMARY: Production of high quality multiple sequence alignments of structured RNAs relies on an iterative combination of manual editing and structure prediction. An essential feature of an RNA alignment editor is the facility to mark-up the alignment based on how it matches a given secondary structure prediction, but few available alignment editors offer such a feature. The RALEE (RNA ALignment Editor in Emacs) tool provides a simple environment for RNA multiple sequence alignment editing, including structure-specific colour schemes, utilizing helper applications for structure prediction and many more conventional editing functions. This is accomplished by extending the commonly used text editor, Emacs, which is available for Linux, most UNIX systems, Windows and Mac OS. AVAILABILITY: The ELISP source code for RALEE is freely available from http://www.sanger.ac.uk/Users/sgj/ralee/ along with documentation and examples. CONTACT: sgj@sanger.ac.uk
