Similar articles
1.
The introduction of multilocus sequence typing (MLST) for the precise characterization of isolates of bacterial pathogens has had a marked impact on both routine epidemiological surveillance and microbial population biology. In both fields, a key prerequisite for exploiting this resource is the ability to discern the relatedness and patterns of evolutionary descent among isolates with similar genotypes. Traditional clustering techniques, such as dendrograms, provide a very poor representation of recent evolutionary events, as they attempt to reconstruct relationships in the absence of a realistic model of the way in which bacterial clones emerge and diversify to form clonal complexes. An increasingly popular approach, called BURST, has been used as an alternative, but present implementations are unable to cope with very large data sets and offer crude graphical outputs. Here we present a new implementation of this algorithm, eBURST, which divides an MLST data set of any size into groups of related isolates and clonal complexes, predicts the founding (ancestral) genotype of each clonal complex, and computes the bootstrap support for the assignment. The most parsimonious patterns of descent of all isolates in each clonal complex from the predicted founder(s) are then displayed. The advantages of eBURST for exploring patterns of evolutionary descent are demonstrated with a number of examples, including the simple Spain(23F)-1 clonal complex of Streptococcus pneumoniae, "population snapshots" of the entire S. pneumoniae and Staphylococcus aureus MLST databases, and the more complicated clonal complexes observed for Campylobacter jejuni and Neisseria meningitidis.
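
The grouping step described in this abstract can be illustrated with a minimal sketch. It is not the eBURST implementation: it assumes that sequence types (STs) are grouped when they are connected by chains of single-locus variants (SLVs), and that the predicted founder of a group is the ST with the most SLVs. The bootstrap support and the display of descent patterns are omitted, and the function names and example profiles are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def locus_differences(a, b):
    """Count the loci at which two allelic profiles differ."""
    return sum(x != y for x, y in zip(a, b))


def group_by_slv(profiles):
    """Group STs that are linked by chains of single-locus variants.

    profiles: dict mapping ST name -> tuple of allele numbers (one per locus).
    Returns (groups, slv_counts): a list of sets of ST names, and a dict
    mapping each ST to the number of SLVs it has in the data set.
    """
    parent = {st: st for st in profiles}

    def find(st):
        # Union-find lookup with path halving.
        while parent[st] != st:
            parent[st] = parent[parent[st]]
            st = parent[st]
        return st

    slv_counts = {st: 0 for st in profiles}
    for a, b in combinations(profiles, 2):
        if locus_differences(profiles[a], profiles[b]) == 1:
            slv_counts[a] += 1
            slv_counts[b] += 1
            parent[find(a)] = find(b)  # merge the two groups

    groups = defaultdict(set)
    for st in profiles:
        groups[find(st)].add(st)
    return list(groups.values()), slv_counts


def predict_founder(group, slv_counts):
    """Assumed rule: the founder is the ST with the most SLVs.

    Ties are broken by ST name so the result is deterministic.
    """
    return max(sorted(group), key=lambda st: slv_counts[st])


if __name__ == "__main__":
    # Hypothetical seven-locus allelic profiles.
    example = {
        "ST1": (1, 1, 1, 1, 1, 1, 1),
        "ST2": (1, 1, 1, 1, 1, 1, 2),
        "ST3": (1, 1, 1, 1, 1, 3, 1),
        "ST4": (1, 1, 1, 1, 1, 3, 4),
        "ST5": (9, 9, 9, 9, 9, 9, 9),
        "ST6": (1, 1, 1, 1, 5, 1, 1),
    }
    groups, counts = group_by_slv(example)
    for group in groups:
        print(sorted(group), "founder:", predict_founder(group, counts))
```

The pairwise comparison is quadratic in the number of STs, which is acceptable for a sketch but would need indexing for database-sized inputs.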

2.
To study the population genetic structure of Pseudomonas aeruginosa, we developed a multilocus sequence typing scheme. The sequences of internal fragments of seven housekeeping genes were obtained for 34 P. aeruginosa isolates from patients hospitalized in five different European cities. Twenty-six different allelic profiles were identified. The mean allelic diversity was 0.854 (range: 0.606-0.978), which was about six times greater than the results obtained with the multilocus enzyme electrophoresis method. Linkage disequilibrium was measured with the index of association. An index of 1.95 ± 0.24 was calculated when all the strains were considered. This index was 1.76 ± 0.27 when only one strain per sequence type was considered. Both results were different from 0, indicating linkage among loci, which means that the population structure of our set of P. aeruginosa isolates is clonal. The clonal structure of the population was also suggested by the congruence of the topology of the different trees obtained from the seven housekeeping genes. These results are in contrast to previous studies, which found a nonclonal population structure. Since a small number of isolates was analyzed in this study, there might be a selection bias, including the possibility that they belong to widely disseminated epidemic clones. Another possibility is that recombination does not occur homogeneously throughout the genome of P. aeruginosa, so that part of it has a clonal structure, while the remaining part of the genome is more frequently subject to recombination.

3.
The endosymbiotic bacterium Wolbachia enhances its spread via vertical transmission by generating reproductive effects in its hosts, most notably cytoplasmic incompatibility (CI). Additionally, frequent interspecific horizontal transfer is evident from a lack of phylogenetic congruence between Wolbachia and its hosts. The mechanisms of this lateral transfer are largely unclear. To identify potential pathways of Wolbachia movements, we performed multilocus sequence typing of Wolbachia strains from bees (Anthophila). Using a host phylogeny and ecological data, we tested various models of horizontal endosymbiont transmission. In general, Wolbachia strains seem to be randomly distributed among bee hosts. Kleptoparasite-host associations among bees as well as other ecological links could not be supported as the sole basis for the spread of Wolbachia. However, cophylogenetic analyses and divergence time estimations suggest that Wolbachia may persist within a host lineage over considerable timescales and that strictly vertical transmission and subsequent random loss of infections across lineages may have had a greater impact on Wolbachia strain distribution than previously estimated. Although general conclusions about Wolbachia movements among arthropod hosts cannot be made, we present a framework by which precise assumptions about shared evolutionary histories of Wolbachia and a host taxon can be modelled and tested.

4.
Despite its importance as a human pathogen, information on population structure and global epidemiology of Staphylococcus epidermidis is scarce and the relative importance of the mechanisms contributing to clonal diversification is unknown. In this study, we addressed these issues by analyzing a representative collection of S. epidermidis isolates from diverse geographic and clinical origins using multilocus sequence typing (MLST). Additionally, we characterized the mobile element (SCCmec) carrying the genetic determinant of methicillin resistance. The 217 S. epidermidis isolates from our collection were split by MLST into 74 types, suggesting a high level of genetic diversity. Analysis of MLST data using the eBURST algorithm revealed the existence of nine epidemic clonal lineages that were disseminated worldwide. One single clonal lineage (clonal complex 2) comprised 74% of the isolates, whereas the remaining isolates were clustered into 8 minor clonal lineages and 13 singletons. According to our evolutionary model, SCCmec was acquired at least 56 times by S. epidermidis. Although geographic dissemination of S. epidermidis strains and the value of the index of association between the alleles, 0.2898 (P < 0.05), support the clonality of S. epidermidis species, examination of the sequence changes at MLST loci during clonal diversification showed that recombination gives rise to new alleles approximately twice as frequently as point mutations. We suggest that S. epidermidis has a population with an epidemic structure, in which nine clones have emerged upon a recombining background and evolved quickly through frequent transfer of genetic mobile elements, including SCCmec.

5.
In conservation and management of species it is important to make inferences about gene flow, dispersal and population structure. In this study, we used 613 georeferenced tissue samples from hazel grouse (Bonasa bonasia), where each individual was genotyped at 12 microsatellite loci, to make inferences about population genetic structure, gene flow and dispersal in northern Sweden. Observed levels of genetic diversity suggest that Swedish hazel grouse do not suffer loss of genetic diversity compared with other grouse species. We found significant F(IS) (deviation from Hardy-Weinberg expectations) over the entire sample using jack-knifed estimators over loci, which is most likely explained by a Wahlund effect. With the use of spatial autocorrelation methods, we detected significant isolation by distance among individuals. Neighbourhood size was estimated in the order of 62-158 individuals, corresponding to a dispersal distance of 950-1500 m. Using a spatial statistical model for landscape genetics to infer the number of populations and the spatial location of genetic discontinuities between these populations, we found indications that Swedish hazel grouse are divided into a northern and a southern population. We could not find a sharp border between these two populations, and none of the observed borders appeared to coincide with any potential geographical barriers. These results imply that gene flow appears somewhat unrestricted in the boreal taiga forests of northern Sweden and that the two populations of hazel grouse in Sweden may be explained by the post-glacial reinvasion history of the Scandinavian Peninsula.

6.
Feil EJ, Smith JM, Enright MC, Spratt BG. Genetics 2000, 154(4): 1439-1450
Multilocus sequence typing (MLST) is a highly discriminatory molecular typing method that defines isolates of bacterial pathogens using the sequences of approximately 450-bp internal fragments of seven housekeeping genes. This technique has been applied to 575 isolates of Streptococcus pneumoniae and identifies a number of discrete clonal complexes. These clonal complexes are typically represented by a single group of isolates sharing identical alleles at all seven loci, plus single-locus variants that differ from this group at only one out of the seven loci. As MLST is highly discriminatory, the members of each clonal complex can be assumed to have a recent common ancestor, and the molecular events that give rise to the single-locus variants can be used to estimate the relative contributions of recombination and mutation to clonal divergence. By comparing the sequences of the variant alleles within each clonal complex with the allele typically found within that clonal complex, we estimate that recombination has generated new alleles at a frequency approximately 10-fold higher than mutation, and that a single nucleotide site is approximately 50 times more likely to change through recombination than mutation. We also demonstrate how to estimate the average length of recombinational replacements from MLST data.

7.
Bacillus cereus strains from cases of severe or lethal systemic infections, including cases with respiratory symptoms, were analyzed using the multilocus sequence typing scheme of the B. cereus MLST database. The isolates were evenly distributed between the two main clades, and 60% of them had allele profiles new to the database. Half of the strains in the collection clustered in a lineage neighboring the phylogenetic origin of Bacillus anthracis. Strains from lethal cases with respiratory symptoms were allocated to both main clades. This is the first report of strains causing respiratory symptoms being identified as genetically distant from B. anthracis. The phylogenetic locations of the strains presented here were compared with those of all isolates from systemic infections previously submitted to the database; the strains were found to fall in the same clusters to which clinical isolates from other studies had been assigned. The pathogenic strains appear to form clusters on the phylogenetic tree.

8.
Clustering analysis of SAGE data using a Poisson approach   Total citations: 3 (self-citations: 1; citations by others: 2)
Serial analysis of gene expression (SAGE) data have been poorly exploited by clustering analysis owing to the lack of appropriate statistical methods that consider their specific properties. We modeled SAGE data by Poisson statistics and developed two Poisson-based distances. Their application to simulated and experimental mouse retina data shows that the Poisson-based distances are more appropriate and reliable for analyzing SAGE data than other commonly used distances or similarity measures such as Pearson correlation or Euclidean distance.

9.
Enterococcus faecalis has recently become an important etiological agent of health care-associated infections (HAIs), and there is a need for evaluation and comparison of the typing methods available for this microorganism. We tested multilocus VNTR (variable-number tandem repeats) analysis (MLVA) on a well-characterized collection of 153 clinical isolates of E. faecalis, corresponding to 52 multilocus sequence types and 67 pulsed-field gel electrophoresis (PFGE) profiles. MLVA showed high discriminatory power, discerning 111 different types (diversity index of 98.9%). The concordance of MLVA with MLST and with PFGE was 0.95 and 0.74, respectively. The high discriminatory power of MLVA indicates its utility for local epidemiology, such as outbreak investigation, and for differentiation of clones defined by other methods.

10.
Inference of population structure using multilocus genotype data   Total citations: 243 (self-citations: 0; citations by others: 243)
We describe a model-based clustering method for using multilocus genotype data to infer population structure and assign individuals to populations. We assume a model in which there are K populations (where K may be unknown), each of which is characterized by a set of allele frequencies at each locus. Individuals in the sample are assigned (probabilistically) to populations, or jointly to two or more populations if their genotypes indicate that they are admixed. Our model does not assume a particular mutation process, and it can be applied to most of the commonly used genetic markers, provided that they are not closely linked. Applications of our method include demonstrating the presence of population structure, assigning individuals to populations, studying hybrid zones, and identifying migrants and admixed individuals. We show that the method can produce highly accurate assignments using modest numbers of loci, e.g., seven microsatellite loci in an example using genotype data from an endangered bird species. The software used for this article is available from http://www.stats.ox.ac.uk/~pritch/home.html.

11.
Inference of bacterial microevolution using multilocus sequence data   Total citations: 5 (self-citations: 0; citations by others: 5)
Didelot X, Falush D. Genetics 2007, 175(3): 1251-1266
We describe a model-based method for using multilocus sequence data to infer the clonal relationships of bacteria and the chromosomal position of homologous recombination events that disrupt a clonal pattern of inheritance. The key assumption of our model is that recombination events introduce a constant rate of substitutions to a contiguous region of sequence. The method is applicable both to multilocus sequence typing (MLST) data from a few loci and to alignments of multiple bacterial genomes. It can be used to decide whether a subset of isolates share common ancestry, to estimate the age of the common ancestor, and hence to address a variety of epidemiological and ecological questions that hinge on the pattern of bacterial spread. It should also be useful in associating particular genetic events with the changes in phenotype that they cause. We show that the model outperforms existing methods of subdividing recombinogenic bacteria using MLST data and provide examples from Salmonella and Bacillus. The software used in this article, ClonalFrame, is available from http://bacteria.stats.ox.ac.uk/.

12.
We compared the potential of direct genome restriction enzyme analysis (DGREA) and pulsed-field gel electrophoresis (PFGE) for discriminating Vibrio vulnificus isolates from clinical (23) and environmental (17) sources. The genotypes generated by both methodologies were compared to previous multilocus sequence typing (MLST) data. DGREA established clearer relationships among V. vulnificus strains and was more consistent with MLST than with PFGE. DGREA is a very promising tool for epidemiological and ecological studies of V. vulnificus.

13.
A geostatistical perspective on spatial genetic structure may explain methodological issues of quantifying spatial genetic structure and suggest new approaches to addressing them. We use a variogram approach to (i) derive a spatial partitioning of molecular variance, gene diversity, and genotypic diversity for microsatellite data under the infinite allele model (IAM) and the stepwise mutation model (SMM), (ii) develop a weighting of sampling units to reflect ploidy levels or multiple sampling of genets, and (iii) show how variograms summarize the spatial genetic structure within a population under isolation-by-distance. The methods are illustrated with data from a population of the epiphytic lichen Lobaria pulmonaria, using six microsatellite markers. Variogram-based analysis not only avoids bias due to the underestimation of population variance in the presence of spatial autocorrelation, but also provides estimates of population genetic diversity and the degree and extent of spatial genetic structure accounting for autocorrelation.

14.
Phylogeny estimation is crucial in the study of molecular evolution. The increase in the amount of available genomic data facilitates phylogeny estimation from multilocus sequence data. Although maximum likelihood and Bayesian methods are available for phylogeny reconstruction using multilocus sequence data, these methods require heavy computation, and their application is limited to the analysis of a moderate number of genes and taxa. Distance matrix methods present suitable alternatives for analyzing huge amounts of sequence data. However, the manner in which distance methods can be applied to multilocus sequence data remains unknown. Here, we suggest new procedures to estimate molecular phylogeny using multilocus sequence data and evaluate their significance in the framework of the distance method. We found that concatenation of the multilocus sequence data may result in incorrect phylogeny estimation with an extremely high bootstrap probability (BP), which is due to incorrect estimation of the distances and deliberate neglect of intergene variation. Therefore, we suggest that the distance matrices for multilocus sequence data be estimated separately for each gene and then combined to reconstruct the phylogeny, instead of reconstructing the phylogeny from concatenated sequence data. To calculate the BPs of the reconstructed phylogeny, we suggest that 2-stage bootstrap procedures be adopted, in which genes are resampled first, followed by resampling of the sequence columns within the resampled genes. By resampling the genes during calculation of BPs, intergene variation is properly considered. Via simulation studies and empirical data analysis, we demonstrate that our 2-stage bootstrap procedures are more suitable than the conventional bootstrap procedure that is adopted after sequence concatenation.
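
The 2-stage resampling described in this abstract can be sketched as follows. This is an illustrative sketch only, not the authors' code: it assumes each gene is an alignment stored as a list of equal-length strings (one per taxon, in the same taxon order for every gene), and it covers only the resampling step. Estimating per-gene distance matrices, combining them and building the tree are left out. The function name and the toy alignments are hypothetical.

```python
import random


def two_stage_bootstrap(genes, rng):
    """Draw one 2-stage bootstrap replicate.

    genes: list of alignments; each alignment is a list of equal-length
           sequence strings, one per taxon.
    rng:   a random.Random instance, so that replicates are reproducible.
    Returns a list with as many resampled alignments as there are genes.
    """
    replicate = []
    for _ in range(len(genes)):
        # Stage 1: resample a gene with replacement.
        gene = rng.choice(genes)
        n_cols = len(gene[0])
        # Stage 2: resample alignment columns with replacement
        # within the gene that was just drawn.
        cols = [rng.randrange(n_cols) for _ in range(n_cols)]
        replicate.append(["".join(seq[c] for c in cols) for seq in gene])
    return replicate


if __name__ == "__main__":
    # Hypothetical toy alignments for two taxa and three genes.
    toy_genes = [["ACGT", "ACGA"], ["GG", "GC"], ["TTT", "TAT"]]
    rng = random.Random(0)
    for alignment in two_stage_bootstrap(toy_genes, rng):
        print(alignment)
```

In a full analysis this function would be called once per bootstrap replicate, and a tree would be rebuilt from the per-gene distance matrices of each replicate.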

15.

Background  

Tiling array data is hard to interpret due to noise. The wavelet transformation is a widely used technique in signal processing for elucidating the true signal from noisy data. Consequently, we attempted to denoise representative tiling array datasets for ChIP-chip experiments using wavelets. In doing this, we used specific wavelet basis functions, Coiflets, since their triangular shape closely resembles the expected profiles of true ChIP-chip peaks.

16.
The advent of next generation sequencing (NGS) technologies has revolutionised the way biologists produce, analyse and interpret data. Although NGS platforms provide a cost-effective way to discover genome-wide variants from a single experiment, variants discovered by NGS need follow-up validation due to the high error rates associated with various sequencing chemistries. Recently, whole exome sequencing has been proposed as an affordable option compared to whole genome runs, but it still requires follow-up validation of all the novel exomic variants. Customarily, a consensus approach is used to overcome the systematic errors inherent to the sequencing technology, alignment and post-alignment variant detection algorithms. However, this approach requires multiple sequencing chemistries, multiple alignment tools and multiple variant callers, which may not be viable in terms of time and money for individual investigators with limited informatics know-how. Biologists often lack the requisite training to deal with the huge amount of data produced by NGS runs and face difficulty in choosing from the list of freely available analytical tools for NGS data analysis. Hence, there is a need to customise the NGS data analysis pipeline to preferentially retain true variants by minimising the incidence of false positives, and to make the choice of the right analytical tools easier. To this end, we have sampled different freely available tools used at the alignment and post-alignment stages, suggesting the use of the most suitable combination determined by a simple framework of pre-existing metrics to create significant datasets.

17.
Zhao Y, Yu H, Zhu Y, Ter-Minassian M, Peng Z, Shen H, Diao N, Chen F. PLoS ONE 2012, 7(2): e31134
Family-based association studies (FBAS) have the advantages of controlling for population stratification and testing for linkage and association simultaneously. We propose a retrospective multilevel model (rMLM) approach to analyze sibship data by using genotypic information as the dependent variable. Simulated data sets were generated using the simulation of linkage and association (SIMLA) program. We compared rMLM to the sib transmission/disequilibrium test (S-TDT), sibling disequilibrium test (SDT), conditional logistic regression (CLR) and generalized estimation equations (GEE) on the measures of power, type I error, estimation bias and standard error. The results indicated that rMLM was a valid test of association in the presence of linkage using sibship data. The advantages of rMLM became more evident when the data contained concordant sibships. Compared to GEE, rMLM underestimated the odds ratio (OR) less. Our results support the application of rMLM to detect gene-disease associations using sibship data. However, caution is warranted regarding the risk of an increased type I error rate when there is association without linkage between the disease locus and the genotyped marker.

18.

Quantitative dynamical models facilitate the understanding of biological processes and the prediction of their dynamics. These models usually comprise unknown parameters, which have to be inferred from experimental data. For quantitative experimental data, there are several methods and software tools available. However, for qualitative data the available approaches are limited and computationally demanding. Here, we consider the optimal scaling method which has been developed in statistics for categorical data and has been applied to dynamical systems. This approach turns qualitative variables into quantitative ones, accounting for constraints on their relation. We derive a reduced formulation for the optimization problem defining the optimal scaling. The reduced formulation possesses the same optimal points as the established formulation but requires fewer degrees of freedom. Parameter estimation for dynamical models of cellular pathways revealed that the reduced formulation improves the robustness and convergence of optimizers. This resulted in substantially reduced computation times. We implemented the proposed approach in the open-source Python Parameter EStimation TOolbox (pyPESTO) to facilitate reuse and extension. The proposed approach enables efficient parameterization of quantitative dynamical models using qualitative data.

19.
MOTIVATION: Sequence annotations, functional and structural data on snake venom neurotoxins (svNTXs) are scattered across multiple databases and literature sources. Sequence annotations and structural data are available in the public molecular databases, while functional data are almost exclusively available in the published articles. There is a need for a specialized svNTXs database that contains NTX entries, which are organized, well annotated and classified in a systematic manner. RESULTS: We have systematically analyzed svNTXs and classified them using structure-function groups based on their structural, functional and phylogenetic properties. Using conserved motifs in each phylogenetic group, we built an intelligent module for the prediction of structural and functional properties of unknown NTXs. We also developed an annotation tool to aid the functional prediction of newly identified NTXs as an additional resource for the venom research community. AVAILABILITY: We created a searchable online database of NTX protein sequences (http://research.i2r.a-star.edu.sg/Templar/DB/snake_neurotoxin). This database can also be found under the Swiss-Prot Toxin Annotation Project website (http://www.expasy.org/sprot/).

20.
Several methods have been developed to estimate the selfing rate of a population from a sample of individuals genotyped for several marker loci. These methods can be based on homozygosity excess (or inbreeding), identity disequilibrium, progeny array (PA) segregation or population assignment incorporating partial selfing. The progeny array-based method is generally the best because it is not subject to some assumptions made by other methods (such as lack of misgenotyping, absence of biparental inbreeding and presence of inbreeding equilibrium), and it can reveal other facets of a mixed-mating system such as patterns of shared paternity. However, in practice, it is often difficult to obtain PAs, especially for animal species. In this study, we propose a method to reconstruct the pedigree of a sample of individuals taken from a monoecious diploid population practicing mixed mating, using multilocus genotypic data. Selfing and outcrossing events are then detected when an individual derives from identical parents and from two distinct parents, respectively. Selfing rate is estimated by the proportion of selfed offspring in the reconstructed pedigree of a sample of individuals. The method enjoys many advantages of the PA method, but without the need for a priori family structure, although such information, if available, can be utilized to improve the inference. Furthermore, the new method accommodates genotyping errors, estimates allele frequencies jointly and is robust to the presence of biparental inbreeding and inbreeding disequilibrium. Both simulated and empirical data were analysed by the new and previous methods to compare their statistical properties and accuracies.
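
The final estimation step in this abstract, the proportion of selfed offspring in a reconstructed pedigree, can be written down in a few lines. The sketch below assumes the pedigree has already been reconstructed and is given as a mapping from each offspring to its two inferred parents. Pedigree reconstruction itself, which is the substance of the method, is not shown. Skipping offspring with a missing parent is an assumption made here for illustration, and all names are hypothetical.

```python
def selfing_rate(pedigree):
    """Estimate the selfing rate from a reconstructed pedigree.

    pedigree: dict mapping offspring ID -> (parent1 ID, parent2 ID).
              A parent may be None if it could not be inferred.
    An offspring counts as selfed when both inferred parents are the
    same individual, and as outcrossed when they are distinct.
    Offspring with a missing parent are skipped (an assumption of this
    sketch, not a rule taken from the abstract).
    """
    assigned = [
        (p1, p2)
        for p1, p2 in pedigree.values()
        if p1 is not None and p2 is not None
    ]
    if not assigned:
        raise ValueError("no offspring with two inferred parents")
    selfed = sum(p1 == p2 for p1, p2 in assigned)
    return selfed / len(assigned)


if __name__ == "__main__":
    # Hypothetical reconstructed pedigree: one of four assigned offspring is selfed.
    example = {
        "o1": ("A", "A"),
        "o2": ("A", "B"),
        "o3": ("B", "C"),
        "o4": ("C", "D"),
        "o5": ("A", None),
    }
    print(selfing_rate(example))
```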
